
Examining the challenges for delivering voice and speech QoE

3G Optimisation Forum
Barbieri Lucia 28 03 2006

Agenda
1. Voice Key Performance Indicators
2. Voice QoE Evaluation Methods
3. Active Monitoring
   - Scenarios
   - E2E KPIs
   - Speech Quality: differences between GSM and W-CDMA networks
   - User perception of voice codecs
4. Passive Monitoring
5. Conclusions

Presentation title in footer 24 June 2013

Confidentiality level on slide master Version number on slide master

Voice Key Performance Indicators


- Call Set up Success Rate [%]: probability that the customer can successfully access the mobile telephony service when requested.
- Call Set up Time [s]: time needed by the customer to access the mobile telephony service.
- Call Termination Success Ratio [%]: probability that a correctly established voice call is normally released by the user (calling or called party).
- Speech Quality: indicator quantifying the speech transmission quality as perceived by the user.

Voice QoE evaluated on these main KPIs


Voice QoE Evaluation Methods

Network Accessibility

Service Accessibility

Service Integrity

Service Retainability

Active Monitoring
E2E test carried out in stationary or mobility mode simulating customer behaviour. The complete end to end chain is measured.

Passive Monitoring
Evaluation based on network counters and passive probing, done on real customer traffic.


Active Monitoring Scenarios


Dedicated, automated E2E tools run call scenarios specially designed to simulate subscriber behaviour in order to evaluate the QoE perceived by the end user. Different test scenarios are considered:
- Mobility or Stationary mode
- Mobile to Mobile or Mobile to Fixed Line
- Mobile Originated or Mobile Terminated calls
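The scenario dimensions listed above combine into a small test matrix. The following Python sketch enumerates it, assuming the three dimensions are independent and every combination is tested. The names `DIMENSIONS` and `build_scenarios` are illustrative and do not come from any test tool.

```python
from itertools import product

# Scenario dimensions taken from the slide; the dictionary keys are illustrative labels.
DIMENSIONS = {
    "movement": ["Mobility", "Stationary"],
    "far_end": ["Mobile to Mobile", "Mobile to Fixed Line"],
    "direction": ["Mobile Originated", "Mobile Terminated"],
}


def build_scenarios(dimensions=DIMENSIONS):
    """Return every combination of the scenario dimensions as a list of dicts."""
    keys = list(dimensions)
    return [dict(zip(keys, combo)) for combo in product(*dimensions.values())]


if __name__ == "__main__":
    for scenario in build_scenarios():
        print(scenario)
```

With two options per dimension the matrix has eight scenarios, which gives a lower bound on the number of distinct call types a campaign needs to schedule.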

Fixed line

CS Network


Active Monitoring KPIs


- Call Set up Success Rate [%]: share of call attempts that progress from the RRC CONNECTION REQUEST message to the ALERTING message, which is passed on the DCCH logical channel from the MSC to the UE to indicate that the connection has been established.
- Call Set up Time [s]: measured from the RRC CONNECTION REQUEST message until the ALERTING message on the DCCH logical channel is passed from the MSC to the UE to indicate that the connection has been established.
- Call Termination Success Ratio [%]: measured from the moment the ALERTING message on the DCCH logical channel is passed from the MSC to the UE until the DISCONNECT towards the network (terminal 1) and the DISCONNECT or RELEASE from the network (terminal 2) messages.
- Speech Quality [MOS-LQO]: Mean Opinion Score (MOS) is the quantification of the end-to-end speech quality as perceived by the customer. It corresponds to a subjective measurement derived entirely from people listening to the calls and scoring the results from 1 to 5. The following five-point category-judgement scale is defined by ITU-T Rec. P.800 for listening quality assessment:

Quality of the speech    Score
Excellent                5
Good                     4
Fair                     3
Poor                     2
Bad                      1
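The three signalling-based KPIs above can be derived from per-call test records. The sketch below assumes a hypothetical `CallRecord` structure holding the timestamps of the RRC CONNECTION REQUEST and ALERTING messages plus a flag for a regular release. Neither the field names nor the structure come from a specific test tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallRecord:
    """One test call (hypothetical layout). Timestamps are in seconds."""
    rrc_request_t: float           # RRC CONNECTION REQUEST sent
    alerting_t: Optional[float]    # ALERTING received, None if set-up failed
    normal_release: bool = False   # True if ended by a regular DISCONNECT/RELEASE


def call_setup_success_rate(calls):
    """Percentage of call attempts that reached ALERTING."""
    ok = sum(1 for c in calls if c.alerting_t is not None)
    return 100.0 * ok / len(calls)


def mean_call_setup_time(calls):
    """Mean time in seconds from RRC CONNECTION REQUEST to ALERTING."""
    times = [c.alerting_t - c.rrc_request_t for c in calls if c.alerting_t is not None]
    return sum(times) / len(times)


def call_termination_success_ratio(calls):
    """Percentage of established calls that were released normally."""
    established = [c for c in calls if c.alerting_t is not None]
    ok = sum(1 for c in established if c.normal_release)
    return 100.0 * ok / len(established)


if __name__ == "__main__":
    # Made-up records, for illustration only.
    calls = [
        CallRecord(0.0, 5.0, True),
        CallRecord(0.0, 7.0, True),
        CallRecord(0.0, None),
        CallRecord(10.0, 16.0, False),
    ]
    print(call_setup_success_rate(calls))
    print(mean_call_setup_time(calls))
    print(call_termination_success_ratio(calls))
```

The functions divide by the number of attempts or established calls, so a real implementation would need to handle an empty measurement period explicitly.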

Speech Quality Evaluation Methods


To assess the customer-perceived speech quality, at least these two basic methodologies should be considered:
- Full Reference (FR) method: this technique is based on a comparison of the original content (Reference) with what is received at the terminal (Processed). To use the FR approach, the Reference must be available at the measurement site, so analysis can only be made for a limited set of pre-selected clips. This method has high accuracy, but the need for the Reference excludes it from the analysis of live content.
- No Reference (NR) method: this technique is based on an analysis of the Processed content without any knowledge of the Reference. NR metrics measure characteristic impairments through feature extraction and pattern matching techniques. It has medium accuracy, but it is ideally suited for quality measurement of live content.


Speech Quality Algorithms


- ITU-T Rec. P.862, together with the related mapping given in ITU-T Rec. P.862.1, is an international standard for the intrusive (full reference) approach. This algorithm describes the opinion of customers related to voice transmission quality and its connected impairments (noise, robot voice, echo, dropouts, etc.).
- ITU-T Rec. P.563 is an international standard for non-intrusive voice quality evaluation (no reference approach).
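The mapping mentioned above converts a raw P.862 score into MOS-LQO with a logistic function. The sketch below shows that step only. The coefficients are quoted as commonly cited for ITU-T Rec. P.862.1 and should be checked against the Recommendation before use; the function name `pesq_to_mos_lqo` is illustrative.

```python
import math


def pesq_to_mos_lqo(raw_pesq: float) -> float:
    """Map a raw P.862 score (roughly -0.5 to 4.5) to MOS-LQO.

    Logistic mapping with the coefficients commonly cited for P.862.1
    (assumption: verify against the Recommendation).
    """
    return 0.999 + (4.999 - 0.999) / (1.0 + math.exp(-1.4945 * raw_pesq + 4.6607))


if __name__ == "__main__":
    for raw in (-0.5, 1.0, 2.0, 3.0, 4.0, 4.5):
        print(raw, round(pesq_to_mos_lqo(raw), 2))
```

The mapping is monotonic and compresses both ends of the raw scale, so the mapped score stays within the 1 to 5 range of the P.800 listening quality scale.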


Speech Quality: Differences between GSM and W-CDMA Networks (1/3)


- Test environment: metropolitan and highway areas have been considered.
- More than 60000 speech samples (uplink and downlink directions) have been considered.
- Data were collected from September to December 2005.
- Handsets:
  - Samsung Z107 configured in dual mode
  - Samsung Z107 configured in single mode GSM


Speech Quality: Differences between GSM and W-CDMA Networks (2/3)

Average LQ**

Direction  Data set        Samsung Z107  Samsung Z107  Samsung Z107  Samsung Z107
                           (GSM)         (UMTS only)   (ISHO)        (dual mode)
UL+DL      Highway         3.65          3.61          3.16          3.63
UL+DL      benchmark+NQT   3.66*         3.64          3.12          3.63
UL         Highway         3.59          3.62          3.15          3.59
UL         benchmark+NQT   3.63*         3.66          3.04          3.65
DL         Highway         3.69          3.61          3.18          3.66
DL         benchmark+NQT   3.69*         3.62          3.23          3.62

Highlights
- Globally, voice quality over GSM is better than over UMTS (UMTS-only speech samples).
- Uplink 3G speech samples are on average better than 2G uplink samples.
- Downlink 2G speech samples are on average better than 3G downlink samples.
- Samsung dual mode voice performance is not suitable for benchmarking radio access technologies, because the percentage of time spent on the UMTS layer is variable.
- Speech samples affected by ISHO contribute heavily to the degradation.
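The GSM and UMTS comparisons in the highlights can be read off the Average LQ table as simple differences. The sketch below computes UMTS-only minus GSM for each direction and data set. The assignment of the values to rows and columns is an assumption recovered from the flattened slide text, and the names `AVERAGE_LQ` and `umts_minus_gsm` are illustrative.

```python
# Average LQ values from the slide. Row and column assignment is an assumption
# recovered from the flattened table; only the GSM and UMTS-only columns are used.
AVERAGE_LQ = {
    ("UL+DL", "Highway"): {"GSM": 3.65, "UMTS only": 3.61},
    ("UL+DL", "benchmark+NQT"): {"GSM": 3.66, "UMTS only": 3.64},
    ("UL", "Highway"): {"GSM": 3.59, "UMTS only": 3.62},
    ("UL", "benchmark+NQT"): {"GSM": 3.63, "UMTS only": 3.66},
    ("DL", "Highway"): {"GSM": 3.69, "UMTS only": 3.61},
    ("DL", "benchmark+NQT"): {"GSM": 3.69, "UMTS only": 3.62},
}


def umts_minus_gsm(table=AVERAGE_LQ):
    """Return the LQ difference (UMTS only minus GSM) for each row, rounded to 2 decimals."""
    return {key: round(row["UMTS only"] - row["GSM"], 2) for key, row in table.items()}


if __name__ == "__main__":
    for (direction, data_set), delta in umts_minus_gsm().items():
        print(f"{direction:6s} {data_set:14s} {delta:+.2f}")
```

A positive difference means the UMTS-only samples scored higher than the GSM samples for that row, and a negative one means GSM scored higher.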

[Figure: LQ cumulative distribution, Samsung Z107 (2G and 3G only). Series: 107 3G dl, 107 3G ul, 107 2G dl, 107 2G ul. Horizontal axis: LQ (3.05 to 3.95). Vertical axis: cumulative percentage (0% to 100%).]

Figure referred to Benchmark + National Quality Test data collection

Further studies are planned to evaluate possible differences in handset behaviour.

UMTS downlink behaviour should be improved so that it is equivalent to or better than the perceived GSM voice quality.
* Samsung Z107 forced on GSM network
** LQ is the output of SQUAD (MOS algorithm provided by Swissqual). UL and DL are, respectively, the speech sample directions from mobile station to network and from network to mobile station.


MOS downlink vs Ec/Io (3/3)

Without ISHO

The downlink speech samples that should be improved are shown in the figure. All samples in the circle have an average Ec/Io higher than the ISHO threshold, yet a MOS corresponding to poor intelligibility and quality.

[Figure: distribution of downlink speech samples by Ec/Io [dB] (-22 to -2) and LQ (MOS) (0 to 4.8). Vertical scale: share of samples (0.0% to 1.5%).]
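The selection described on this slide (good radio conditions but poor speech quality) can be expressed as a simple filter. The sketch below is an illustration under stated assumptions: the sample layout, the function name `samples_to_improve`, the ISHO threshold value and the MOS limit of 2.0 ("Poor" on the P.800 scale) are all hypothetical and are not values given in the presentation.

```python
def samples_to_improve(samples, isho_threshold_db, poor_mos=2.0):
    """Return downlink samples whose average Ec/Io is above the ISHO threshold
    but whose MOS is at or below poor_mos.

    samples: iterable of dicts with keys "ec_io_db" and "mos" (hypothetical layout).
    """
    return [
        s for s in samples
        if s["ec_io_db"] > isho_threshold_db and s["mos"] <= poor_mos
    ]


if __name__ == "__main__":
    # Made-up samples and a placeholder threshold, for illustration only.
    demo = [
        {"ec_io_db": -8.0, "mos": 1.6},   # good Ec/Io, poor MOS: flagged
        {"ec_io_db": -8.0, "mos": 3.7},   # good Ec/Io, good MOS
        {"ec_io_db": -18.0, "mos": 1.4},  # below the threshold: ISHO territory
    ]
    print(samples_to_improve(demo, isho_threshold_db=-14.0))
```

Samples below the threshold are left out on purpose, since their degradation may be explained by the radio conditions that trigger ISHO.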


User perception of voice codecs (1/2)

The figure shows the performance characterization of all 8 AMR Full Rate codec modes for different states of the downlink radio path (C/I).

The AMR characterization test results showed that the selected solution satisfies the AMR requirements in clean speech in the Full Rate channel. The results demonstrate that the combination of all 8 speech codec modes provides a robust Full Rate speech codec down to 4 dB C/I.
The three lowest codec modes are statistically unaffected by propagation errors down to 4 dB C/I. For further details see 3GPP TR 26.075.

[Figure: Experiment 1a - Test Results. MOS (1.0 to 5.0) per codec mode for each C/I condition.]

Experiment 1a - Test Results (MOS)
Codec modes (columns): EFR, 12.2, 10.2, 7.95, 7.4, 6.7, 5.9, 5.15, 4.75
Rows with fewer than nine values list only the values present in the source, in source order.

Conditions              MOS values
No Errors               4.01 4.01 4.06 3.91 3.83 3.77 3.72 3.50 3.50
C/I=16 dB, C/I=13 dB    4.06 4.01 4.13 3.96 4.01 3.94
C/I=10 dB               3.65 3.93 4.05 4.08 3.98 3.80
C/I= 7 dB               3.05 3.44 3.80 3.96 3.84 3.86 3.69 3.58 3.52
C/I= 4 dB               1.53 1.46 2.04 3.26 3.11 3.29 3.59 3.44 3.43
C/I= 1 dB               1.43 1.39 1.87 2.20 2.43 2.66

Note: MOS values are provided in these figures for information only. Mean Opinion Scores can only be representative of the test conditions in which they were recorded (speech material, speech processing, listening conditions, language, and cultural background of the listening subjects). Listening tests performed with other conditions than those used in the AMR characterization phase of testing could lead to a different set of MOS results. Finally, it should be noted that a difference of 0.2 MOS between two test results was usually found not statistically significant.
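The Experiment 1a results can be used to illustrate why a set of codec modes is more robust than any single mode: the best-scoring mode changes with C/I. The sketch below uses only the three conditions for which the slide lists a value for every codec, and assumes the values appear in the same order as the codec header. The names `MOS_FR`, `best_amr_mode` and `equivalent_modes` are illustrative. The 0.2 margin reflects the note that a 0.2 MOS difference was usually not statistically significant.

```python
MODES = ["EFR", "12.2", "10.2", "7.95", "7.4", "6.7", "5.9", "5.15", "4.75"]

# MOS per codec for the conditions with a complete row on the slide.
# Assumption: values are listed in the same order as MODES.
MOS_FR = {
    "No Errors": [4.01, 4.01, 4.06, 3.91, 3.83, 3.77, 3.72, 3.50, 3.50],
    "C/I=7 dB": [3.05, 3.44, 3.80, 3.96, 3.84, 3.86, 3.69, 3.58, 3.52],
    "C/I=4 dB": [1.53, 1.46, 2.04, 3.26, 3.11, 3.29, 3.59, 3.44, 3.43],
}


def amr_scores(condition):
    """Return {mode: MOS} for the AMR modes only (EFR is the reference codec)."""
    scores = dict(zip(MODES, MOS_FR[condition]))
    scores.pop("EFR")
    return scores


def best_amr_mode(condition):
    """Return the AMR mode with the highest MOS for the given condition."""
    scores = amr_scores(condition)
    return max(scores, key=scores.get)


def equivalent_modes(condition, margin=0.2):
    """Return the AMR modes whose MOS is within `margin` of the best mode."""
    scores = amr_scores(condition)
    best = max(scores.values())
    return [mode for mode, mos in scores.items() if best - mos < margin]


if __name__ == "__main__":
    for condition in MOS_FR:
        print(condition, best_amr_mode(condition), equivalent_modes(condition))
```

At 4 dB C/I the modes within the margin are the three lowest bit rates, which matches the statement that they are statistically unaffected by propagation errors down to that level.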


User perception of voice codecs (2/2)

The figure shows the performance characterization of all 6 AMR Half Rate codec modes in clean speech and error conditions.

The AMR characterization test results showed that the selected solution complies with the AMR requirements in clean speech in the Half Rate channel.
The results demonstrate that the combination of all 6 speech codec modes provides a Half Rate speech codec equivalent to the ITU-T G.728 (16 kbit/s) speech codec down to 16 dB C/I. Furthermore, the results show that AMR can provide significantly better performance than GSM FR over the full range of test conditions, and significantly better performance than the GSM HR codec down to 7 dB C/I. For further details see 3GPP TR 26.075.

[Figure: Experiment 1b - Test Results. MOS (1.0 to 5.0) per codec for each C/I condition.]

Experiment 1b - Test Results (MOS)
Codecs (columns): EFR, 7.95, 7.4, 6.7, 5.9, 5.15, 4.75, FR, HR
Rows with fewer than nine values list only the values present in the source, in source order.

Conditions                         MOS values
No Errors                          4.21 4.11 3.93 3.94 3.68 3.70 3.59 3.50 3.35
C/I=19 dB, C/I=16 dB, C/I=13 dB    4.04 3.93 3.96 3.95 3.90 3.82 3.60 3.46 4.21 3.37 3.52 3.53 3.72 3.60 3.42 3.50
C/I=10 dB                          3.74 2.53 2.74 3.10 3.19 3.38 3.30 3.14 3.24
C/I= 7 dB                          3.34 1.60 1.78 2.22 2.57 2.85 3.10 2.74 2.80
C/I= 4 dB                          1.21 1.33 1.84 2.00 1.50 1.92 1.58

Note: MOS values are provided in these figures for information only. Mean Opinion Scores can only be representative of the test conditions in which they were recorded (speech material, speech processing, listening conditions, language, and cultural background of the listening subjects). Listening tests performed with other conditions than those used in the AMR characterization phase of testing could lead to a different set of MOS results. Finally, it should be noted that a difference of 0.2 MOS between two test results was usually found not statistically significant.


Passive Monitoring of Voice QoE


Passive monitoring is mainly based on these sources:
- Network counters: measured and extracted from network elements.
- Passive probing: a system capturing the signalling conveyed on the IuCS and Iub interfaces and decoding the relevant protocols.
Contrary to E2E indicators, passive KPIs are calculated on real customer traffic.
[Figure: network chain NodeBs, Iub, RNC, IuCS, MSC, PSTN, showing the Network Counters and the Passive Probe as data sources.]
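Counter-based KPIs are typically aggregated over many network elements. The sketch below sums per-cell counters before forming the ratio, so that busy cells weigh more than quiet ones. The counter names (`setup_attempts`, `setup_successes`, `normal_releases`) and the function name are hypothetical and are not taken from any vendor's counter set.

```python
def passive_voice_kpis(cells):
    """Aggregate per-cell counters into network-wide KPIs (percentages).

    cells: iterable of dicts with the hypothetical counters
    "setup_attempts", "setup_successes" and "normal_releases".
    """
    attempts = sum(c["setup_attempts"] for c in cells)
    successes = sum(c["setup_successes"] for c in cells)
    releases = sum(c["normal_releases"] for c in cells)
    return {
        "call_setup_success_rate": 100.0 * successes / attempts,
        "call_termination_success_ratio": 100.0 * releases / successes,
    }


if __name__ == "__main__":
    # Made-up counter values, for illustration only.
    cells = [
        {"setup_attempts": 100, "setup_successes": 98, "normal_releases": 97},
        {"setup_attempts": 300, "setup_successes": 285, "normal_releases": 280},
    ]
    print(passive_voice_kpis(cells))
```

Summing the counters first, instead of averaging the per-cell percentages, keeps the result equal to the ratio a customer would experience across all the traffic in the measured area.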

Conclusions
- Active monitoring measures the real end-to-end customer QoE; however, because user behaviour is simulated, only part of the network can be covered. Increasing the test frequency and the geographical areas measured means huge investments.
- Passive monitoring measures real customer traffic (the whole network), but it does not cover the complete end-to-end chain.
- A next step could be active monitoring on friendly users' handsets.


Any Questions?


Thanks for the Attention!
