Lucia Barbieri

Examining the challenges for delivering voice and speech QoE
3G Optimisation Forum
Barbieri Lucia 28 03 2006
Agenda
1. Voice Key Performance Indicators
2. Voice QoE Evaluation Methods
3. Active Monitoring
   Scenarios
   E2E KPIs
   Speech Quality: differences between GSM and W-CDMA networks
   User perception of voice codecs
4. Passive Monitoring
5. Conclusions
Presentation title in footer 24 June 2013
Confidentiality level on slide master Version number on slide master
Voice Key Performance Indicators

Call Set up Success Rate [%]: probability that the customer can successfully access the mobile telephony service when requested.
Call Set up Time [s]: time needed by the customer to access the mobile telephony service.
Call Termination Success Ratio [%]: probability that a correctly established voice call is normally released by the user (calling or called party).
Speech Quality: indicator quantifying the speech transmission quality as perceived by the user.
Voice QoE evaluated on these main KPIs
Voice QoE Evaluation Methods
Network Accessibility
Service Accessibility
Service Integrity
Service Retainability
Active Monitoring
E2E test carried out in stationary or mobility mode simulating customer behaviour. The complete end to end chain is measured.
Passive Monitoring
Evaluation based on network counters and passive probing, carried out on real customer traffic.
Active Monitoring Scenarios

Dedicated, automated E2E tools run call scenarios specifically designed to simulate subscriber behaviour in order to evaluate the QoE perceived by the end user. Different test scenarios are considered:
Mobility or stationary mode
Mobile to Mobile or Mobile to Fixed Line
Mobile Originated or Mobile Terminated calls
Fixed line
CS Network
Active Monitoring KPIs

Call Set up Success Rate [%]: measured from the RRC CONNECTION REQUEST message until the ALERTING message on the DCCH logical channel is passed from the MSC to the UE to indicate that the connection has been established.
Call Set up Time [s]: measured between the same two trigger points, from the RRC CONNECTION REQUEST message until the ALERTING message on the DCCH logical channel is passed from the MSC to the UE.
Call Termination Success Ratio [%]: measured from the moment the ALERTING message on the DCCH logical channel is passed from the MSC to the UE until the DISCONNECT towards the network (terminal 1) and the DISCONNECT or RELEASE from the network (terminal 2).
Speech Quality [MOS-LQO]: the Mean Opinion Score quantifies the end-to-end speech quality as perceived by the customer. It corresponds to a subjective measurement derived entirely from people listening to the calls and scoring them from 1 to 5. ITU-T Rec. P.800 defines the following five-point category-judgement scale for listening quality assessment:

Quality of the speech   Score
Excellent               5
Good                    4
Fair                    3
Poor                    2
Bad                     1
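As a minimal sketch of how an active monitoring tool might turn a batch of logged test calls into the three signalling-based KPIs defined above: the `CallRecord` type and its field names are hypothetical, and the two timestamps simply stand for the RRC CONNECTION REQUEST and ALERTING trigger points. The sample batch is made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CallRecord:
    """One automated test call (all field names are illustrative assumptions)."""
    t_rrc_request: float         # time of RRC CONNECTION REQUEST [s]
    t_alerting: Optional[float]  # time of ALERTING [s]; None if set-up failed
    released_normally: bool      # True if the call ended with a normal release


def voice_kpis(calls: List[CallRecord]) -> Tuple[float, float, float]:
    """Return (set-up success rate [%], mean set-up time [s],
    termination success ratio [%]) for a batch of test calls."""
    established = [c for c in calls if c.t_alerting is not None]
    if not calls or not established:
        raise ValueError("need at least one attempted and one established call")
    cssr = 100.0 * len(established) / len(calls)
    setup_time = sum(c.t_alerting - c.t_rrc_request for c in established) / len(established)
    # Termination success is evaluated only over correctly established calls.
    ctsr = 100.0 * sum(c.released_normally for c in established) / len(established)
    return cssr, setup_time, ctsr


# Made-up sample batch: two good calls, one set-up failure, one dropped call.
calls = [
    CallRecord(0.0, 4.0, True),
    CallRecord(10.0, 16.0, True),
    CallRecord(20.0, None, False),
    CallRecord(30.0, 35.0, False),
]
cssr, setup_time, ctsr = voice_kpis(calls)
print(f"CSSR = {cssr:.1f} %, set-up time = {setup_time:.1f} s, CTSR = {ctsr:.1f} %")
```

Speech Quality is not covered here because it requires a speech quality algorithm rather than signalling timestamps.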
Speech Quality Evaluation Methods

To assess the speech quality perceived by the customer, at least these two basic methodologies should be considered:
Full Reference Method: this technique is based on a comparison of the original content (Reference) with what is received at the terminal (Processed). To use the FR approach, the Reference must be available at the measurement site, so analysis can only be made for a limited set of pre-selected clips. This method has high accuracy but cannot be used to analyse live content.
No Reference Method: this technique is based on an analysis of the Processed content without any knowledge of the Reference. NR metrics measure characteristic impairments through feature extraction and pattern matching techniques. It has medium accuracy but is ideally suited to quality measurement of live content.
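The difference between the two methodologies can be illustrated with two toy metrics. These are deliberately simple stand-ins chosen for this sketch, not the standardised speech quality algorithms: the full-reference function needs both signals, while the no-reference function looks only at the processed signal. The test signal, the simulated dropout and the `silence_level` parameter are assumptions.

```python
import math


def full_reference_snr_db(reference, processed):
    """Toy full-reference metric: SNR of the processed signal against the reference."""
    signal = sum(r * r for r in reference)
    noise = sum((r - p) ** 2 for r, p in zip(reference, processed))
    return float("inf") if noise == 0 else 10.0 * math.log10(signal / noise)


def no_reference_dropout_ratio(processed, silence_level=1e-3):
    """Toy no-reference metric: fraction of samples that look like dropouts."""
    return sum(abs(p) < silence_level for p in processed) / len(processed)


# Reference: 10 periods of a sine wave, sampled so that no sample is near zero.
reference = [math.sin(2 * math.pi * (k + 0.5) / 20) for k in range(200)]
# Processed: the same signal with one period (20 samples) lost in transmission.
processed = list(reference)
processed[60:80] = [0.0] * 20

snr_db = full_reference_snr_db(reference, processed)
dropout_ratio = no_reference_dropout_ratio(processed)
print(f"FR toy metric: SNR = {snr_db:.1f} dB")
print(f"NR toy metric: dropout ratio = {dropout_ratio:.2f}")
```

The no-reference function could run on live traffic because it never touches `reference`; the full-reference function cannot.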
Speech Quality Algorithms

ITU-T Rec. P.862, together with the related mapping given in ITU-T Rec. P.862.1, is an international standard for the intrusive (full reference) approach. This algorithm estimates the opinion of customers about voice transmission quality and its associated impairments (noise, robotic voice, echo, dropouts, etc.).
ITU-T Rec. P.563 is an international standard for non-intrusive voice quality evaluation (no reference approach).
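As an illustration of the mapping step, the sketch below applies the logistic function published in ITU-T Rec. P.862.1 to convert a raw P.862 score into MOS-LQO. The function name is an assumption made for this example, and the raw scores in the loop are arbitrary inputs, not measurements.

```python
import math


def pesq_to_mos_lqo(raw_pesq: float) -> float:
    """Map a raw P.862 score (roughly -0.5 to 4.5) to MOS-LQO using the
    logistic mapping function of ITU-T Rec. P.862.1."""
    return 0.999 + (4.999 - 0.999) / (1.0 + math.exp(-1.4945 * raw_pesq + 4.6607))


# Arbitrary raw scores spanning the P.862 output range.
for raw in (-0.5, 2.0, 3.0, 4.5):
    print(f"raw P.862 score {raw:4.1f} -> MOS-LQO {pesq_to_mos_lqo(raw):.2f}")
```

The mapping is monotonic and compresses the raw range into a MOS-LQO range of roughly 1.0 to 4.5, which makes scores comparable with the five-point listening quality scale.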
Speech Quality: Differences between GSM and W-CDMA Networks

Test environment: metropolitan and highway areas have been considered.
More than 60000 speech samples (uplink and downlink directions) have been considered.
Data were collected from September to December 2005.
Handsets:
Samsung Z107 configured in dual mode
Samsung Z107 configured in single mode GSM
Speech Quality: Differences between GSM and W-CDMA Networks

Average LQ** (all columns: Samsung Z107)

Direction  Data set        GSM     UMTS only  ISHO   Dual mode
UL+DL      Highway         3.65    3.61       3.16   3.63
UL+DL      benchmark+NQT   3.66*   3.64       3.12   3.63
UL         Highway         3.59    3.62       3.15   3.59
UL         benchmark+NQT   3.63*   3.66       3.04   3.65
DL         Highway         3.69    3.61       3.18   3.66
DL         benchmark+NQT   3.69*   3.62       3.23   3.62

Highlights
Globally, voice quality over GSM is better than over UMTS (UMTS-only speech samples).
Uplink 3G speech samples are on average better than 2G uplink samples.
Downlink 2G speech samples are on average better than 3G downlink samples.
Samsung dual mode voice performance is not suitable for benchmarking the radio access technology, because the percentage of time spent on the UMTS layer varies.
Speech samples affected by ISHO (inter-system handover) score markedly worse.
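The first three highlights can be checked directly against the average LQ figures. The sketch below copies the GSM and UMTS-only values from the table above and computes the UMTS minus GSM difference per direction and data set; the assignment of the two rows per direction to "Highway" and "benchmark+NQT" follows the order of the labels in the source and is an assumption.

```python
# Average LQ per (direction, data set); values copied from the slide.
# Row order (Highway first, benchmark+NQT second) is assumed from the label order.
AVERAGE_LQ = {
    ("UL+DL", "Highway"): {"GSM": 3.65, "UMTS": 3.61},
    ("UL+DL", "benchmark+NQT"): {"GSM": 3.66, "UMTS": 3.64},
    ("UL", "Highway"): {"GSM": 3.59, "UMTS": 3.62},
    ("UL", "benchmark+NQT"): {"GSM": 3.63, "UMTS": 3.66},
    ("DL", "Highway"): {"GSM": 3.69, "UMTS": 3.61},
    ("DL", "benchmark+NQT"): {"GSM": 3.69, "UMTS": 3.62},
}


def umts_minus_gsm(table):
    """Return the UMTS-only minus GSM average LQ for every row of the table."""
    return {key: round(row["UMTS"] - row["GSM"], 2) for key, row in table.items()}


deltas = umts_minus_gsm(AVERAGE_LQ)
for (direction, data_set), delta in deltas.items():
    better = "UMTS" if delta > 0 else "GSM"
    print(f"{direction:5s} {data_set:14s} delta = {delta:+.2f}  ({better} better)")
```

The uplink rows come out positive (UMTS better) and the downlink and combined rows negative (GSM better), with every difference below 0.1 LQ.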
LQ cumulative distribution, Samsung Z107 (2G only and 3G only)
[Figure: cumulative distribution (0% to 100%) of LQ over the range 3.05 to 3.95 for four series: 3G DL, 3G UL, 2G DL, 2G UL]
The figure refers to the Benchmark + National Quality Test data collection.
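A cumulative distribution of this kind can be computed from per-sample LQ scores as sketched below. The function name and the sample values are made up for illustration; they are not the measured data behind the figure.

```python
import bisect


def empirical_cdf(samples, thresholds):
    """For each threshold, return the percentage of samples with LQ <= threshold."""
    ordered = sorted(samples)
    n = len(ordered)
    return [100.0 * bisect.bisect_right(ordered, t) / n for t in thresholds]


# Made-up LQ scores for one series (for example, one handset mode and direction).
lq_samples = [3.1, 3.4, 3.5, 3.6, 3.6, 3.7, 3.8, 3.9]
thresholds = [3.05, 3.5, 3.95]

cdf = empirical_cdf(lq_samples, thresholds)
for t, pct in zip(thresholds, cdf):
    print(f"LQ <= {t:.2f}: {pct:.1f} % of samples")
```

Plotting one such curve per series on a common LQ axis shows at a glance which series has the larger share of low-scoring samples.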
Further studies are planned to evaluate possible differences in handset behaviour.
UMTS downlink behaviour should be improved so that it is equivalent to or better than the perceived GSM voice quality.
* Samsung Z107 forced onto the GSM network
** LQ is the output of SQUAD (MOS algorithm provided by SwissQual). UL and DL are the speech sample directions from mobile station to network and from network to mobile station, respectively.
MOS downlink vs Ec/Io

Without ISHO
[Figure: downlink speech samples without ISHO. Axes: Ec/Io [dB] from -22 to -2; LQ (MOS) from 0 to 4.8; percentage of samples from 0.0% to 1.5%]
The figure shows the downlink speech samples that should be improved. All the circled samples have an average Ec/Io higher than the ISHO threshold, yet a MOS indicating poor intelligibility and quality.
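The selection of the samples "in the circle" can be expressed as a simple filter, sketched below. The slides give neither the ISHO threshold nor the MOS limit, so both constants are assumed values chosen only for illustration, and the sample list is made up.

```python
# Both limits are assumptions for this sketch; the slides do not state them.
ISHO_ECIO_THRESHOLD_DB = -14.0  # assumed Ec/Io level below which ISHO is triggered
POOR_MOS_LIMIT = 2.5            # assumed MOS below which quality counts as poor


def samples_to_improve(samples,
                       ecio_threshold_db=ISHO_ECIO_THRESHOLD_DB,
                       mos_limit=POOR_MOS_LIMIT):
    """Return the (average Ec/Io [dB], MOS) samples with acceptable radio
    conditions (Ec/Io above the ISHO threshold) but poor speech quality."""
    return [(ecio, mos) for ecio, mos in samples
            if ecio > ecio_threshold_db and mos < mos_limit]


# Made-up downlink samples: (average Ec/Io in dB, MOS).
samples = [(-6.0, 3.8), (-8.0, 2.1), (-18.0, 1.9), (-10.0, 2.4), (-12.0, 3.5)]

flagged = samples_to_improve(samples)
share = 100.0 * len(flagged) / len(samples)
print(f"{len(flagged)} of {len(samples)} samples flagged ({share:.1f} %): {flagged}")
```

The sample at -18 dB is not flagged even though its MOS is low, because its poor quality is explained by radio conditions below the assumed ISHO threshold.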

User perception of voice codecs

The figure alongside shows the performance characterization of all 8 AMR Full Rate codec modes for different states of the downlink radio path (C/I).
The AMR characterization test results showed that the selected solution satisfies the AMR requirements in clean speech in the Full Rate channel. The results demonstrate that the combination of all 8 speech codec modes provides a robust Full Rate speech codec down to 4 dB C/I.
The three lowest codec modes are statistically unaffected by propagation errors down to 4 dB C/I. For further details see 3GPP TR 26.075.

Experiment 1a - Test Results (MOS, scale 1.0 to 5.0)

Conditions   EFR   12.2  10.2  7.95  7.4   6.7   5.9   5.15  4.75
No Errors    4.01  4.01  4.06  3.91  3.83  3.77  3.72  3.50  3.50
C/I= 7 dB    3.05  3.44  3.80  3.96  3.84  3.86  3.69  3.58  3.52
C/I= 4 dB    1.53  1.46  2.04  3.26  3.11  3.29  3.59  3.44  3.43

Partial rows (values in source order; codec columns not identified):
C/I=16 dB and C/I=13 dB: 4.06 4.01 4.13 3.96 4.01 3.94
C/I=10 dB: 3.65 3.93 4.05 4.08 3.98 3.80
C/I= 1 dB: 1.43 1.39 1.87 2.20 2.43 2.66

Note: MOS values are provided in these figures for information only. Mean Opinion Scores can only be representative of the test conditions in which they were recorded (speech material, speech processing, listening conditions, language, and cultural background of the listening subjects). Listening tests performed under conditions other than those used in the AMR characterization phase of testing could lead to a different set of MOS results. Finally, it should be noted that a difference of 0.2 MOS between two test results was usually found not to be statistically significant.
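To make the "combination of all 8 codec modes" concrete, the sketch below picks the highest-scoring AMR Full Rate mode for each condition and compares it with EFR. It uses only the three conditions for which the slide gives a value for every codec mode (No Errors, C/I = 7 dB, C/I = 4 dB). The column alignment of those values is assumed from the order in which they appear, and the function name is hypothetical. In view of the note above, differences below 0.2 MOS should not be over-interpreted.

```python
# AMR Full Rate codec modes in kbit/s, in the column order of the slide.
MODES = ["12.2", "10.2", "7.95", "7.4", "6.7", "5.9", "5.15", "4.75"]

# MOS per condition for the eight AMR modes (column alignment is assumed).
AMR_FR_MOS = {
    "No Errors": [4.01, 4.06, 3.91, 3.83, 3.77, 3.72, 3.50, 3.50],
    "C/I= 7 dB": [3.44, 3.80, 3.96, 3.84, 3.86, 3.69, 3.58, 3.52],
    "C/I= 4 dB": [1.46, 2.04, 3.26, 3.11, 3.29, 3.59, 3.44, 3.43],
}
# EFR reference codec under the same conditions.
EFR_MOS = {"No Errors": 4.01, "C/I= 7 dB": 3.05, "C/I= 4 dB": 1.53}


def best_mode(scores):
    """Return (mode, MOS) of the highest-scoring codec mode in one row."""
    index = max(range(len(scores)), key=scores.__getitem__)
    return MODES[index], scores[index]


best = {condition: best_mode(scores) for condition, scores in AMR_FR_MOS.items()}
for condition, (mode, mos) in best.items():
    print(f"{condition}: best AMR mode {mode} kbit/s, MOS {mos:.2f} "
          f"(EFR {EFR_MOS[condition]:.2f})")
```

The best mode moves towards lower bit rates as C/I degrades, which is the behaviour that lets the combined codec stay robust while EFR alone collapses at 4 dB C/I.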

User perception of voice codecs

The figure alongside shows the performance characterization of all 6 AMR Half Rate codec modes in clean speech and error conditions.
The AMR characterization test results showed that the selected solution complies with the AMR requirements in clean speech in the Half Rate channel.
The results demonstrate that the combination of all 6 speech codec modes provides a Half Rate speech codec equivalent to the ITU G.728 (16 kbit/s) speech codec down to 16 dB C/I. Furthermore, the results show that AMR can provide significantly better performance than GSM FR over the full range of test conditions, and significantly better performance than the GSM HR codec down to 7 dB C/I. For further details see 3GPP TR 26.075.

Experiment 1b - Test Results (MOS, scale 1.0 to 5.0)

Conditions   EFR   7.95  7.4   6.7   5.9   5.15  4.75  FR    HR
No Errors    4.21  4.11  3.93  3.94  3.68  3.70  3.59  3.50  3.35
C/I=10 dB    3.74  2.53  2.74  3.10  3.19  3.38  3.30  3.14  3.24
C/I= 7 dB    3.34  1.60  1.78  2.22  2.57  2.85  3.10  2.74  2.80

Partial rows (values in source order; codec columns not identified):
C/I=19 dB, C/I=16 dB and C/I=13 dB: 4.04 3.93 3.96 3.95 3.90 3.82 3.60 3.46 4.21 3.37 3.52 3.53 3.72 3.60 3.42 3.50
C/I= 4 dB: 1.21 1.33 1.84 2.00 1.50 1.92 1.58

Note: MOS values are provided in these figures for information only. Mean Opinion Scores can only be representative of the test conditions in which they were recorded (speech material, speech processing, listening conditions, language, and cultural background of the listening subjects). Listening tests performed under conditions other than those used in the AMR characterization phase of testing could lead to a different set of MOS results. Finally, it should be noted that a difference of 0.2 MOS between two test results was usually found not to be statistically significant.
Passive Monitoring of Voice QoE

Passive monitoring is mainly based on these sources:
Network counters: measured and extracted from network elements.
Passive probing: a system that captures the signalling conveyed on the IuCS and Iub interfaces and decodes the relevant protocols.
Unlike the E2E indicators, passive KPIs are calculated on real customer traffic.
[Diagram: NodeBs, Iub, RNC, IuCS, MSC, PSTN; network counters are taken from the network elements and a passive probe monitors the Iub and IuCS interfaces]
Conclusions
Active monitoring measures the real end-to-end customer QoE; however, because user behaviour is simulated, only part of the network can be covered.
Increasing the test frequency and the geographical areas measured means huge investment.
Passive monitoring measures real customer traffic across the whole network, but it does not cover the complete end-to-end chain.
A next step could be active monitoring on friendly users' handsets.
Any Questions?
Thanks for the Attention!
