Delivery Manager
British Telecom
SIT Alumnus Telecom Branch Year 2000 (1st batch)
Syllabus
Module IV
Service Assurance: 7 Hrs
Proactive service assurance
Reactive service assurance
Customer and network Fault / Fault Management
Trouble Ticket Management
Topology & Configuration Management
Planning & Testing
Performance Management
SLA Management
Service Assurance
Do you get annoyed when you cannot reach your friend on your mobile phone?
And what about that time when you sent a message to your colleague that you
were running late and it turned out that she never got it?
Today your mobile phone can handle more services than just voice telephony.
A lot of effort is put into introducing services such as video call, mobile TV,
streaming, etc.
As a user you want to be certain that when you pay for a service it will work
properly!
No matter whether it is because you want to make sure that you will be able to
see that particular episode of your favourite programme or because you want to
be certain that you can reach your business partner at any time.
As an operator, being able to measure performance, and so guarantee a certain quality of
service, brings a number of advantages. One of these is the ability to prioritize the
traffic flows of customers that generate high revenue streams.
Service Assurance
Service assurance, in telecommunications, is the application of policies and
processes by a Communications Service Provider (CSP) to ensure that services
offered over networks meet a pre-defined service quality level for an optimal
subscriber experience.
There are two main types of service assurance processes:
Proactive assurance:
Includes policies and processes to proactively pinpoint, diagnose and resolve
service quality degradations or device malfunctions before subscribers are
impacted. This is done before any complaints are raised by customers.
Reactive assurance:
Includes policies and processes to pinpoint, diagnose and resolve service quality
degradations or device malfunctions. This is done as part of trouble ticket
handling, after the customer has made a complaint about service quality.
As price and service features become more and more similar between service providers,
quality becomes a key differentiator for the customer. It improves customer loyalty and
makes the end user more eager to use the service.
It will also reduce churn and influence the possibilities for additional services in a positive way.
By providing an expected quality of the services, the service provider might get the image of a
trusted and respected provider.
There will always be a segment of the market that is more interested in high quality and
reliability, and is willing to pay more in order to get it.
The revenues can also be increased through the possibility to reduce the Time-To-Market of a
service.
If the network is overloaded it is possible to prioritize the services that generate high revenue
streams.
The operational expenditure (OPEX) will decrease because of the improved operational
processes.
Problems can also be solved proactively through early alarms about service performance
degradations. This enables the operator to discover problems and resolve them before they get too
big. Thus the cost for resolving problems can be reduced.
Surveying and maintaining the network will be more effective since personnel at all levels will
have the same information about service quality and usage.
The operator can also reduce over-provisioning of network resources, since measurements of the
actual service quality can be made.
These performance parameters are called Service Level Objectives (SLOs) and indicate that
e.g. the response time or availability of the service should be kept at a certain level.
The actual value of the parameters is then measured and compared with the value stated in
the SLO.
The agreement often states penalties that should be paid in case of non-compliance.
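The comparison of a measured value against an SLO can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `SLO` class, the 99.9 % availability target, the 200 ms response-time target and the measurement figures are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    """One Service Level Objective (names and targets here are hypothetical)."""
    name: str
    target: float
    higher_is_better: bool  # True for availability, False for response time


def is_compliant(slo: SLO, measured: float) -> bool:
    """Compare the measured value with the value stated in the SLO."""
    if slo.higher_is_better:
        return measured >= slo.target
    return measured <= slo.target


def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Availability over a period, as a percentage of total time."""
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


# Assumed example objectives.
availability_slo = SLO("availability_percent", 99.9, higher_is_better=True)
response_slo = SLO("response_time_ms", 200.0, higher_is_better=False)

# Assumed measurements: 50 minutes of downtime in a 30-day month.
measured_availability = availability_percent(30 * 24 * 60, 50)

print(is_compliant(availability_slo, measured_availability))
print(is_compliant(response_slo, 180.0))
```

A non-compliant result is the point at which any penalty stated in the agreement would be considered.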
The collected data is then aggregated in the Data Refinement layer and KPIs are defined.
Thresholds are set up in order to be able to issue service alarms and reports.
The status of the service components is presented to the Service Level Manager in the form
of events and statistical reports.
The Service Level Manager, in turn, gathers the information and correlates it if necessary
to create either a general or an individual performance view.
The KPIs are then compared with the pre-defined Service Level Objectives (SLOs) agreed
upon in the SLA.
While the Service Level Manager provides a real-time view, the SLO reporting function
provides a historical view. It is used for follow-up and final analysis of service
performance and SLAs.
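The flow from raw samples to a KPI and then to a service alarm can be sketched as below. This is a hedged illustration: the function names, the latency samples and the warning and critical thresholds are assumptions, and the KPI is taken to be one where a higher value is worse.

```python
from statistics import mean


def aggregate_kpi(samples):
    """Aggregate raw measurements into a single KPI value (here: the average)."""
    return mean(samples)


def evaluate_kpi(value, warning_threshold, critical_threshold):
    """Return the service alarm level for a KPI where higher values are worse.

    Returns None when the KPI is below both thresholds.
    """
    if value >= critical_threshold:
        return "critical"
    if value >= warning_threshold:
        return "warning"
    return None


# Assumed latency samples in milliseconds from one collection interval.
latency_samples = [120, 180, 240, 260]
latency_kpi = aggregate_kpi(latency_samples)

# Assumed thresholds: warn at 150 ms, critical at 250 ms.
alarm = evaluate_kpi(latency_kpi, warning_threshold=150, critical_threshold=250)
print(latency_kpi, alarm)
```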
Fault Management
Fault Event
Performance Event
Alarmed
Non-Alarmed
Critical
Major
Minor
Warning
Defects
Anomalies
Functions of Fault Management System:
Network monitoring + alarm management + advanced alarm processing function
Fault diagnostic/ root cause analysis/ troubleshooting
Maintaining historical alarm logs
Trouble ticketing
Proactive fault management
Network Monitoring:
Allow a network provider organization:
To see whether the network is operating as expected
To keep track of its current state
To visualize the current state
The most important task consists of collecting alarms and making sure that nothing
important is missed
Basic Alarm Management Function:
Historical alarm data can be useful
To resolve future problems faster by recognizing patterns and recalling their past
resolution
To establish trends, to see how alarm rates and types of alarms reported have evolved
over time
Clearing of alarms: a second alarm is sent to indicate that the alarm condition no longer
exists
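The relationship between active alarms, clearing alarms and the historical log can be sketched as follows. The `AlarmList` class and the device and condition names are hypothetical and only illustrate the idea.

```python
class AlarmList:
    """Tracks currently active alarms and keeps a historical log of all events."""

    def __init__(self):
        self.active = {}   # (source, condition) -> severity
        self.history = []  # every raise and clear, in arrival order

    def raise_alarm(self, source, condition, severity):
        """Record a new alarm condition as active."""
        self.active[(source, condition)] = severity
        self.history.append(("raise", source, condition, severity))

    def clear_alarm(self, source, condition):
        """Handle the second alarm that says the condition no longer exists."""
        self.active.pop((source, condition), None)
        self.history.append(("clear", source, condition, None))


alarms = AlarmList()
alarms.raise_alarm("router-1", "link_down", "major")
alarms.clear_alarm("router-1", "link_down")

print(alarms.active)        # no active alarms remain
print(len(alarms.history))  # but both events stay in the log
```

Keeping the log separate from the active list is what makes later pattern and trend analysis possible.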
Advanced Alarm Management Function:
Alarm and Event Filtering
Filtering: to block out as many irrelevant, less important or redundant events and
alarms as possible
Subscribe only to the alarms needed, as specified by some criteria
Deduplication: discard redundant alarms within a time window
Correlation: to preprocess and aggregate data from events and alarms and distill it into more
concise and meaningful information
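Filtering, deduplication and a very simple form of correlation can be sketched as below. This is an illustration under stated assumptions: the alarm dictionaries, the severity ranking used as the filter criterion, the 60-second window and the per-source count are all choices made for the example.

```python
from collections import Counter

# Assumed ordering of the severities, lowest to highest.
SEVERITY_RANK = {"warning": 1, "minor": 2, "major": 3, "critical": 4}


def filter_alarms(alarms, min_severity):
    """Keep only alarms at or above the given severity."""
    threshold = SEVERITY_RANK[min_severity]
    return [a for a in alarms if SEVERITY_RANK[a["severity"]] >= threshold]


def deduplicate(alarms, window_seconds):
    """Drop an alarm if the same source and condition was seen within the window."""
    last_seen = {}
    kept = []
    for alarm in sorted(alarms, key=lambda a: a["time"]):
        key = (alarm["source"], alarm["condition"])
        previous = last_seen.get(key)
        if previous is None or alarm["time"] - previous > window_seconds:
            kept.append(alarm)
        last_seen[key] = alarm["time"]
    return kept


def summarize_by_source(alarms):
    """Aggregate alarms into a count per source, a minimal form of correlation."""
    return dict(Counter(a["source"] for a in alarms))


raw = [
    {"time": 0, "source": "router-1", "condition": "link_down", "severity": "major"},
    {"time": 5, "source": "router-1", "condition": "link_down", "severity": "major"},
    {"time": 8, "source": "switch-2", "condition": "fan_speed", "severity": "warning"},
    {"time": 120, "source": "router-1", "condition": "link_down", "severity": "major"},
]

filtered = filter_alarms(raw, "minor")
unique = deduplicate(filtered, window_seconds=60)
print(summarize_by_source(unique))
```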
Fault Diagnosis and Troubleshooting
The analysis process that leads to a diagnosis is often referred to as a root cause analysis
For example: Device Overheating
Troubleshooting can involve simply
retrieving additional monitoring data about a device.
Injecting some tests into a network or a device
Loopback tests
Ping / traceroute
Proactive Fault Management
Most fault management is reactive
Deal with faults after they have occurred
Proactive fault management
Taking steps to avoid failure conditions before they occur
Testing the network to detect deterioration in the quality of service
Alarm analysis that recognizes patterns of alarms caused by minor faults that point to
bigger problems
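One simple way to picture the alarm analysis described above is to flag devices that raise several low-severity alarms in a short period. The function name, the alarm format, the 600-second window and the count of three are assumptions for illustration only; real systems use richer pattern recognition.

```python
from collections import defaultdict


def find_deteriorating_devices(alarms, window_seconds, min_count):
    """Return sources with at least min_count minor/warning alarms in any window."""
    times_by_source = defaultdict(list)
    for alarm in alarms:
        if alarm["severity"] in ("minor", "warning"):
            times_by_source[alarm["source"]].append(alarm["time"])

    flagged = []
    for source, times in times_by_source.items():
        times.sort()
        start = 0
        for end in range(len(times)):
            # Shrink the window until it spans at most window_seconds.
            while times[end] - times[start] > window_seconds:
                start += 1
            if end - start + 1 >= min_count:
                flagged.append(source)
                break
    return sorted(flagged)


observed = [
    {"time": 0, "source": "olt-7", "severity": "minor"},
    {"time": 100, "source": "olt-7", "severity": "minor"},
    {"time": 200, "source": "olt-7", "severity": "warning"},
    {"time": 0, "source": "olt-9", "severity": "minor"},
    {"time": 5000, "source": "olt-9", "severity": "minor"},
]

print(find_deteriorating_devices(observed, window_seconds=600, min_count=3))
```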
Trouble Ticket
The trouble ticket system helps keep track of which trouble tickets are still outstanding
Trouble tickets are assigned to operators, who are responsible for resolving them
Not every alarm results in a trouble ticket, only alarm conditions that impact service
delivery or need human intervention
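The rule that only some alarms become trouble tickets can be sketched as follows. The `Alarm` fields `service_affecting` and `needs_human`, the `TicketSystem` class and the operator name are hypothetical; how a real system decides these flags is outside this sketch.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Alarm:
    source: str
    condition: str
    service_affecting: bool  # assumed flag: the condition impacts service delivery
    needs_human: bool        # assumed flag: the condition needs human intervention


class TicketSystem:
    """Opens tickets for qualifying alarms and tracks which are still outstanding."""

    def __init__(self):
        self._ids = count(1)
        self.outstanding = {}  # ticket id -> (alarm, assigned operator)

    def handle(self, alarm, operator):
        """Open a ticket only when the alarm qualifies; otherwise return None."""
        if not (alarm.service_affecting or alarm.needs_human):
            return None
        ticket_id = next(self._ids)
        self.outstanding[ticket_id] = (alarm, operator)
        return ticket_id

    def resolve(self, ticket_id):
        """Remove a ticket from the outstanding list once it is resolved."""
        self.outstanding.pop(ticket_id, None)


system = TicketSystem()
ignored = system.handle(Alarm("switch-2", "fan_speed", False, False), "operator-a")
opened = system.handle(Alarm("router-1", "link_down", True, False), "operator-a")
print(ignored, opened, list(system.outstanding))
```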
Process Flow:
Need for Fault Management system:
Limitations of human resources
Various Network Element vendors in the same network environment
Technology and network are more complex
Huge number of alarms from huge number of network elements
Need to find the root cause and solve the problems faster
The problems encountered vary from customer to customer and from place to place, but at
a broad level Trouble Tickets can be categorized as either Telecom exchange end or
Customer equipment end. These can be further subdivided into categories on the basis of
skill set requirements.
This allows the tester to state more accurately how long it will take for a request
to be processed and performed. The tester sends the customer an email to notify them when
the work will be completed.
If the fault is similar to one already reported, it is marked as a repeat Trouble
Ticket.
The customer is provided with the Trouble Ticket ID so that they can check the status of
the Trouble Ticket.
A Trouble Ticket application should have a workflow automation engine to run necessary
tasks, such as a line test or network test, to detect the cause and category of the problem
reported.
Provide regular updates to the client. The client should receive regular updates
on the current status of the trouble ticket.
Fix an appointment in case the resolution requires a customer site visit.
Maintain the Trouble Ticket closure reports.
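The ticket-handling points above (repeat marking, a ticket ID for status checks, and regular updates) can be sketched as below. The `TroubleTicketDesk` class, the `TT-` ID format, the status values and the rule that a repeat is the same customer, service and symptom are all assumptions for the example.

```python
class TroubleTicketDesk:
    """Registers customer-reported faults and lets the customer check status by ID."""

    def __init__(self):
        self.tickets = {}
        self._next_number = 1

    def report(self, customer, service, symptom):
        """Create a ticket; mark it as a repeat if the same fault was reported before."""
        is_repeat = any(
            t["customer"] == customer
            and t["service"] == service
            and t["symptom"] == symptom
            for t in self.tickets.values()
        )
        ticket_id = f"TT-{self._next_number:05d}"
        self._next_number += 1
        self.tickets[ticket_id] = {
            "customer": customer,
            "service": service,
            "symptom": symptom,
            "repeat": is_repeat,
            "status": "open",
            "updates": [],
        }
        return ticket_id

    def add_update(self, ticket_id, note, status=None):
        """Record a progress note and optionally change the ticket status."""
        ticket = self.tickets[ticket_id]
        ticket["updates"].append(note)
        if status is not None:
            ticket["status"] = status

    def status(self, ticket_id):
        """What the customer sees when checking the ticket by its ID."""
        return self.tickets[ticket_id]["status"]


desk = TroubleTicketDesk()
first = desk.report("customer-42", "broadband", "no sync")
second = desk.report("customer-42", "broadband", "no sync")
desk.add_update(first, "Line test passed, site visit booked", status="appointment fixed")
print(first, desk.status(first), desk.tickets[second]["repeat"])
```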
Problem Subject
Synchronization
Reconciliation
The network is considered the master source of information
The information should reflect what is actually in the network
Discrepancy reporting
The decision of how to synchronize is made by the user
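The difference between discrepancy reporting and reconciliation can be sketched as below. The dictionaries standing in for the management inventory and the live network, and the device and version names, are hypothetical.

```python
def find_discrepancies(inventory, network):
    """Discrepancy reporting: list devices whose recorded and actual config differ.

    Each entry is (device, recorded value, actual value); None means missing.
    """
    report = []
    for device in sorted(set(inventory) | set(network)):
        recorded = inventory.get(device)
        actual = network.get(device)
        if recorded != actual:
            report.append((device, recorded, actual))
    return report


def reconcile(inventory, network):
    """Reconciliation: the network is the master, so the inventory is overwritten."""
    return dict(network)


# Assumed example data: device name -> software version.
inventory = {"router-1": "v1.0", "switch-2": "v2.1", "router-3": "v1.0"}
network = {"router-1": "v1.1", "switch-2": "v2.1", "olt-7": "v3.0"}

for entry in find_discrepancies(inventory, network):
    print(entry)

inventory = reconcile(inventory, network)
print(find_discrepancies(inventory, network))
```

With discrepancy reporting alone, the list is shown to the user, who then decides in which direction to synchronize each item.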
Image Management
How to keep track of which software images are installed on which network device
How to deliver new images to those devices without disrupting service
Common Uses
Engage in clients' operational support systems (OSS) development lifecycle for new
product roll-outs and/or application enhancements
Acceleration of launch intervals
Optimize and introduce methods and procedures (M&P) and training materials
Deliver a focused and methodical approach to ensuring a higher degree of quality in OSS
applications and associated business processes for the business end user
Coordinate various stakeholder organizations and IT teams for planning and execution
phases
Manage and capture results to identify defects and enhancements to systems and
processes
Smoother integration and adoption by end-users
Achieve productivity and efficiency gains through identification of future process and
tool enhancements
Performance Management
The primary function of an ideal performance management system is to optimize the use of
the network and applications so as to provide a consistent and predictable level of service.
Once this goal is established, then the focus of performance management is to optimize the
performance of the network and applications in order to comply with service-level
agreements.
Performance management involves the following:
Configuring data-collection methods and network testing
Collecting performance data
Optimizing network service response time
Proactive management and reporting
Managing the consistency and quality of network services
Performance management is the measurement of network and application traffic for the
purpose of providing a consistent and predictable level of service at a given instant and
across a defined period of time.
Performance management involves monitoring the network, application, and service
activity and adjusting designs and configurations in order to meet performance
requirements or improve performance and traffic management.
A general baseline includes all areas of the network, such as a connectivity diagram,
inventory details, device configurations, software versions, application/disk/CPU/memory
utilization, link bandwidth, and so on.
In summary, the objective of baselining is to create a knowledge base of the network and
keep it up to date.
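For the utilization part of a baseline, one possible sketch is to summarize historical samples per metric and then check new measurements against that summary. The metric names, the sample values and the three-standard-deviation tolerance are assumptions, not something the baseline definition above prescribes.

```python
from statistics import mean, pstdev


def build_baseline(samples_by_metric):
    """Summarize historical samples per metric as a mean and standard deviation."""
    baseline = {}
    for metric, samples in samples_by_metric.items():
        baseline[metric] = {"mean": mean(samples), "stdev": pstdev(samples)}
    return baseline


def deviates(baseline, metric, value, tolerance=3.0):
    """True when a new value lies more than `tolerance` deviations from the mean."""
    entry = baseline[metric]
    return abs(value - entry["mean"]) > tolerance * entry["stdev"]


# Assumed historical samples collected during normal operation.
history = {
    "cpu_percent": [40, 42, 38, 41, 39],
    "link_mbps": [610, 590, 600, 605, 595],
}
baseline = build_baseline(history)

print(deviates(baseline, "cpu_percent", 55))
print(deviates(baseline, "cpu_percent", 41))
```

Rebuilding the baseline from fresh samples at regular intervals is one way to keep it up to date.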
SLA Management
Service Level Agreement Management is responsible for establishing, reviewing and
cancelling Service Level Agreements (SLAs) with the customer.
Service Level Agreements
are based on Service Design
are negotiated and agreed with the customer
need the Supply Management Process for the supply of services from external partners, at
service levels agreed in supplier contracts, i.e. underpinning contracts (UCs), if needed
need the Supply Management Process for the supply of services from internal partners, at
service levels agreed in OLAs (operating level agreements: internal agreements supporting
the SLA requirements), if needed
The purpose of Service Level Agreement Management is to manage Service Level
Agreements in a way that customer requirements are reflected and contracts are
coordinated and harmonized. The basic requirement is to balance the value and quality for the
customer with the costs of service.
Service Agreement Management contributes to an integrated Service Management approach
by achieving the following goals:
Every service provided to a customer is covered by an SLA containing a description of the
guaranteed and agreed service level.
To achieve the service level targets, OLAs and UCs are established in support of the SLAs
by the Supply Management Process.
High Level Process Chart
This chart illustrates the Service Agreement Management process and its activities as well as
the status model reflected by the service record evolution.
SLA Requirements Engineering
The Service Level Agreement staff and the CRM staff form a team and decide who will act as
the requirements engineering agent. This agent contacts the customer and defines the
customer requirements. The agent needs to be an expert on the services offered by
the company, in order to match customer requirements with existing services or to determine
whether the required services can be delivered from a commercial and technical point of view.
Close cooperation with the Service Design Process, CRM and all Operation Processes is necessary.
The final requirements need to be written down and agreed with the customer. A requirements
document is provided.
SLA Negotiation
Requirements are priced and then discussed with the customer. Different options for service
delivery are discussed. In addition, for each service a service level is defined, priced and
discussed. The pricing of the different service levels is supported by the Service Level
Management Process.
A final agreement version is distributed and discussed between all involved parties.
Activity Specific Rules:
trigger the Service Level Management Process for price information on the different service
levels of the service offered to the customer
combine all information into an offer/agreement
discuss with the customer
if the final version is agreed, provide the final contract version to all parties for signing
set status to "negotiated"
SLA Agreement
The final version of the contract is checked by all parties, now including a legal check. The
final version is distributed, signed and documented in the contract/agreement database (see CRM
Process).
Activity Specific Rules:
Provide final check of the agreement, including a legal check
Sign contract
Document contract by providing it to the CRM Process
Set status to "agreed"
SLA Monitoring
Existing agreements need to be monitored:
All aspects NOT involving service quality are monitored by the CRM Process
All aspects concerning service quality are monitored by the Service Level Management Process
Activity Specific Rules:
monitor quality of service -> trigger Service Level Management Process for monitoring,
receive and analyse information
monitor all other contract aspects -> trigger CRM Process, receive and analyse information
monitor agreement for end of service (based on date, conditions or customer request)
if end of service
set status to "end of service"
continue with next activity
SLA Reporting
Service level reports, used by both the business and the IT department, contain the
monitoring data used to measure performance against objectives.
SLA Review
The service level agreement is reviewed through a formal procedure: the service level agreement
review (SLA Review). The SLA Review is a two-way communication between the IT
department and the organization. It ensures that the services are being delivered efficiently
and are optimized to meet the organization's requirements.
SLA Expiration
Contract cancellation is handled in the CRM Process because of the legal implications of
cancelling a contract. Service Level Agreement Management is triggered by the CRM Process and
the monitoring of the agreement is stopped.
Activity Specific Rules:
stop monitoring of agreement on trigger from CRM process
set agreement status to "expired"
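The status values set by the activity rules ("negotiated", "agreed", "end of service", "expired") suggest a simple status model, which can be sketched as below. The strictly linear order of transitions and the `ServiceLevelAgreement` class are assumptions made for this illustration; the process chart itself may allow other paths.

```python
# Assumed allowed transitions, following the order of the SLA activities.
ALLOWED_TRANSITIONS = {
    "negotiated": {"agreed"},
    "agreed": {"end of service"},
    "end of service": {"expired"},
    "expired": set(),
}


class ServiceLevelAgreement:
    """Holds the status of one agreement and rejects out-of-order changes."""

    def __init__(self, customer):
        self.customer = customer
        self.status = "negotiated"  # set at the end of SLA Negotiation

    def set_status(self, new_status):
        """Move to the next status, refusing transitions the model does not allow."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status!r} to {new_status!r}")
        self.status = new_status


sla = ServiceLevelAgreement("customer-42")
sla.set_status("agreed")          # SLA Agreement: contract signed
sla.set_status("end of service")  # SLA Monitoring: end of service detected
sla.set_status("expired")         # SLA Expiration: trigger from the CRM process
print(sla.status)
```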
Penalties
The SLA also defines penalty clauses that apply if the service level agreed in the SLA is not met.
The penalties for non-conformance should be detailed, but the emphasis should be on
motivation rather than on antagonizing the other party. These may range from notification of:
Lost fees
Repayment of fees
Compensation for lost earnings
Termination
Combination of the above
Invoking penalty clauses does not necessarily gain great benefit, in that the damage is already
done and any monetary penalty is unlikely to compensate even partially for business lost or
damage to brand image. Terminating and moving to another provider may not necessarily
improve matters; such action will lose the goodwill (if any) and cumulative knowledge gained
between the enterprise and provider. For internal parties, penalties will not necessarily act as
an incentive to remedy the condition once failed. Incentives to rectify the problem as quickly
as possible should be considered to offset any penalties and to encourage co-operation at what
is likely to be a stressful time, as should bonuses for over-performance.
References
OSS/ BSS for Converged Telecommunication Networks A practical approach
By Rahul Wargad
http://telecombillingbasics.blogspot.in/2010/10/trouble-ticket-management-intelecom.html
http://www.cisco.com/c/en/us/products/collateral/services/highavailability/white_paper_c11-478096.pdf
http://www.afutt.org/Qostic/qostic1/SLA-DI-USG-TMF-060091-SLA_TMForum.pdf
http://www.mitsm.de/service-level-agreement-management-en
http://www.google.co.in/url?sa=t&rct=j&q=&esrc=s&frm=1&source=web&cd=1&cad=rja&ua
ct=8&ved=0CBwQFjAA&url=http%3A%2F%2Fwww.itu.int%2Fitudoc%2Fitut%2Fworkshop%2Foptical%2Fs9amp01_pp7.ppt&ei=ETpiVO2xBoHOmwWs4GwCw&usg=AFQjCNHePsPwKm5A6dvLp9LSkQnLry6T3g&sig2=o2cZ306_ZAYzKHQX7842w
Questions???
Thank You!