SwissQual License AG Allmendweg 8 CH-4528 Zuchwil Switzerland t +41 32 686 65 65 f +41 32 686 65 66 e info@swissqual.com www.swissqual.com
SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free of errors and omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents. SwissQual's liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory services. Copyright 2000 - 2012 SwissQual AG. All rights reserved. No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated into any human or computer language without the prior written permission of SwissQual AG. Confidential materials. All information in this document is regarded as commercially valuable, protected and privileged intellectual property, and is provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material. When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark somewhere in your text. SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are registered trademarks of SwissQual AG.
Diversity Explorer, Diversity Ranger, Diversity Unattended, NiNA+, NiNA, NQAgent, NQComm, NQDI, NQTM, NQView, NQWeb, QPControl, QPView, QualiPoc Freerider, QualiPoc iQ, QualiPoc Mobile, QualiPoc Static, QualiWatch-M, QualiWatch-S, SystemInspector, TestManager, VMon, VQuad-HD are trademarks of SwissQual AG. SwissQual acknowledges the following trademarks for company names and products: Adobe, Adobe Acrobat, and Adobe Postscript are trademarks of Adobe Systems Incorporated. Apple is a trademark of Apple Computer, Inc. DIMENSION, LATITUDE, and OPTIPLEX are registered trademarks of Dell Inc. ELEKTROBIT is a registered trademark of Elektrobit Group Plc. Google is a registered trademark of Google Inc. Intel, Intel Itanium, Intel Pentium, and Intel Xeon are trademarks or registered trademarks of Intel Corporation. INTERNET EXPLORER, SMARTPHONE, TABLET are registered trademarks of Microsoft Corporation. Java is a U.S. trademark of Sun Microsystems, Inc. Linux is a registered trademark of Linus Torvalds. Microsoft, Microsoft Windows, Microsoft Windows NT, and Windows Vista are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries U.S. NOKIA is a registered trademark of Nokia Corporation. Oracle is a registered US trademark of Oracle Corporation, Redwood City, California. SAMSUNG is a registered trademark of Samsung Corporation. SIERRA WIRELESS is a registered trademark of Sierra Wireless, Inc. TRIMBLE is a registered trademark of Trimble Navigation Limited. U-BLOX is a registered trademark of u-blox Holding AG. UNIX is a registered trademark of The Open Group.
Contents
1 Introduction
  Overview
  Enabling 3G on a Mobile Phone
    Example
  Forcing Into a Video Call
    Supported Call Options and Voices Tests
2 Phone-to-Phone Video Telephony Settings
  Speech Samples and Recording Time
  Voice Test Results
  Capturing a Video Call With a Camera
3 Using the H.324M Video Stack for Video Telephony
  Defining a Video Stack Job
  Results and Restrictions
    Viewing Results in Real Time
    Voice Transmission
    Video Transmission
    Lip Synchronization
    Audio and Video Connect Trigger Points
    Restrictions
    Asynchronous Link to the Phone With the V.80 Protocol
4 Video Telephony to Video Stack or to Responder
  Phone to Stack Option
  Phone to Responder Option
Figures
Figure 1-1 Overview of the components in a CS64K video call
Figure 2-1 Phone to Phone test option in TestManager
Figure 2-2 Record Hangover setting for Video Telephony
Figure 3-1 Video Stack options in Test Manager
Figure 3-2 Record HO setting in Test Manager
Figure 3-3 Example schematic of the serial link from the Host PC to the phone
Figure 4-1 Phone to Video Stack option in Test Manager
Figure 4-2 Phone to Responder option in Test Manager
Figure 4-3 Temporal structure for an audio-video clip that is used in responder mode.
Introduction
The Video Telephony Settings Manual describes the video telephony options for measurement tests on a Diversity system. For intrusive speech jobs, the system uses Video Stack mode to control both sides of the call. For nonintrusive jobs, that is, call to any number and access test, the system uses Phone mode.
Overview
Video telephony is a service that uses the CS64K channel in UMTS networks. After a phone establishes the physical channel, the logical media channels transfer the audio content, the video content, and the control protocol. Typically, a 3G-324M multiplexer provides these channels and uses a control protocol that is based on H.245. SwissQual Diversity software offers the following test modes for video telephony:
- Phone mode: The video telephony application runs on a mobile phone and the media content is provided through connectors, such as headset connectors.
- Video Stack mode: The complete video telephony application runs on a host PC Module (PCM) that uses a mobile phone as a dial-up device.
The advantages and disadvantages of these modes are discussed in the following sections.
During a video call, the voice signal is split into 20 ms frames and encoded by an audio encoder, which is usually AMR. The video signal is encoded frame by frame as H.263 or MPEG-4 (usually baseline). The 3G-324M multiplexer splits this information into 160-byte packets with header and control information and then transmits the packets synchronously over the physical CS64K channel. A packet contains one complete AMR-encoded speech frame and part of a video frame. Since a complete video frame is split into several packets, the video content is less robust against transmission errors than the speech content.
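The packet figures above can be checked with a short calculation. The following Python sketch is only an illustration: it assumes that the full 64 kbit/s of the CS64K channel carries the 160-byte packets and that speech uses the AMR 12.2 kbit/s mode. The function names are hypothetical, and the size of the multiplexer header is not given in this manual, so it is not subtracted.

```python
# Figures from the text: 64 kbit/s channel, 160-byte packets, 20 ms speech frames.
# Assumption: speech is encoded in the AMR 12.2 kbit/s mode.
CHANNEL_BITRATE_BPS = 64_000
PACKET_SIZE_BYTES = 160
SPEECH_FRAME_MS = 20
AMR_BITRATE_BPS = 12_200


def packet_interval_ms(packet_bytes=PACKET_SIZE_BYTES, bitrate_bps=CHANNEL_BITRATE_BPS):
    """Time that one packet occupies on the channel, in milliseconds."""
    return packet_bytes * 8 * 1000 / bitrate_bps


def speech_frame_bits(bitrate_bps=AMR_BITRATE_BPS, frame_ms=SPEECH_FRAME_MS):
    """Number of bits in one encoded speech frame."""
    return bitrate_bps * frame_ms // 1000


def remaining_bits_per_packet():
    """Bits left in a packet for video, header, and control information."""
    return PACKET_SIZE_BYTES * 8 - speech_frame_bits()


print(packet_interval_ms())         # one packet per 20 ms speech frame
print(speech_frame_bits())          # bits used by the speech frame
print(remaining_bits_per_packet())  # bits shared by video, header, and control
```

Under these assumptions, one packet is sent for every 20 ms speech frame, and the speech frame uses only a small part of the packet. The remainder is shared by the video data, the header, and the control information.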
Example
To enable 3G on a Nokia N95 mobile phone, select > Tools > Settings > Phone > Network >
1 CONFIDENTIAL MATERIALS
Chapter 1 | Introduction
Direction of A/V
Uplink (A->B) Downlink (B->A) Speech** Speech P.862
MOC
MTC
x x x x x
x x x (x)*
x x x
x x x x x
x x x x x
x x x x x
x x
x x
x x
The 3G-324M stack, which is normally part of the terminal software, enables the video telephony service over CS64K channels. This stack integrates the audio and video data multiplexer and the H.245 protocol, and also controls the audio and video encoders and decoders. Video stack mode bypasses the phone's internal stack and runs these software components on the host PCM instead. This mode gives you complete control of the contents that you want to transmit in a video telephony connection. This level of control is required when you want to transmit and evaluate predefined voice and video clips as 'intrusive' or 'full-reference' quality measurements. In video stack mode, a fixed video clip of head and shoulders is transmitted. If you want, you can combine this clip with a voice clip that is in a supported format. The quality of the video clip is evaluated by the SwissQual VQuad test, which is a full-reference video quality model. The quality of the voice clip is evaluated with SQuad-LQ or optionally with ITU-T P.862.
In video stack mode, you can only use speech or speech-P.862 intrusive tests. Although you cannot change the default video clip, you can use any of the available speech reference samples.
Chapter 3 | Using the H.324M Video Stack for Video Telephony CONFIDENTIAL MATERIALS
4. On the General tab in the Details area, enter 12500 in the Record HO box.
Important: Due to improvements in the VQuad algorithm, a value of 12500 ms is sufficient for R97 and above.
The high Record HO value provides Diversity with enough time to record the complete signal. For example, when the signal delay is long, the PCM requires more time than usual to complete the high performance video evaluation. However, the high Record HO value also decreases the number of video telephony evaluations that can be performed. For example, in a 90 s call, a PCM only has time to complete four evaluations.
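The example of four evaluations in a 90 s call can be reproduced with a simple calculation. The following Python sketch assumes that each evaluation cycle consists of the 6-second clip followed by the Record HO time, and it ignores call setup and teardown. The function name and this cycle model are illustrative assumptions, not a description of the Diversity scheduler.

```python
# Assumption: one evaluation cycle = clip playback + Record HO time.
CLIP_MS = 6_000        # 6-second video clip used in video stack mode
RECORD_HO_MS = 12_500  # Record HO value entered in the job settings


def evaluations_per_call(call_ms, clip_ms=CLIP_MS, record_ho_ms=RECORD_HO_MS):
    """Number of complete evaluation cycles that fit into a call."""
    cycle_ms = clip_ms + record_ho_ms
    return call_ms // cycle_ms


print(evaluations_per_call(90_000))  # complete cycles in a 90 s call
```

With these values, a cycle lasts 18.5 s, so four complete cycles fit into a 90 s call and a fifth does not.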
Voice Transmission
The default codec for the audio channel is AMR-FR (12.2 kbit/s). If you assume that a transmission is undistorted, you can expect SQuad-LQ values greater than 3.7. However, you can select any of the speech reference samples that were installed by SwissQual.
Video Transmission
For the video transmission, a 6-second fixed head-and-shoulders video clip with a corresponding speech clip is used. The voice and video clips are transmitted and recorded synchronously. If the speech clip is not transmitted, blue frames are fed into the video channel. Note: There is no content dependency between the speech and the video clips. The video clip is only a typical example that is interlaced with two meaningful sentences. The video signal is encoded with the ITU-T H.263 codec and has a fixed frame rate of 5 fps. The use of this codec, combined with the restricted transmission capacity, leads to a strong degradation of the video quality, even in perfect radio conditions.
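The fixed frame rate determines how many frames the clip contains and how often a new frame is due. The following Python sketch derives these values. It assumes that a 160-byte packet occupies 20 ms of the 64 kbit/s channel, which follows from the packet size given in the Overview. The function names are hypothetical.

```python
# Figures from the text: 6-second clip, fixed frame rate of 5 fps.
# Assumption: one multiplexer packet is sent every 20 ms.
CLIP_DURATION_S = 6
FRAME_RATE_FPS = 5
PACKET_INTERVAL_MS = 20


def frames_in_clip(duration_s=CLIP_DURATION_S, fps=FRAME_RATE_FPS):
    """Total number of video frames in the clip."""
    return duration_s * fps


def frame_interval_ms(fps=FRAME_RATE_FPS):
    """Time between two consecutive video frames, in milliseconds."""
    return 1000 // fps


def packets_per_frame_interval(fps=FRAME_RATE_FPS, packet_ms=PACKET_INTERVAL_MS):
    """Number of packets that are sent during one frame interval."""
    return frame_interval_ms(fps) // packet_ms


print(frames_in_clip())              # frames in the 6-second clip
print(frame_interval_ms())           # milliseconds between frames
print(packets_per_frame_interval())  # packets sent per frame interval
```

Because several packets are sent during each frame interval, a single lost or corrupted packet can affect a complete video frame.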
Lip Synchronization
The term LipSync describes the de-synchronization that can occur between the video and the audio signals in a transmission. This condition is easy to observe when the source of the sound effect is visible in the video, for example, a speaker's face, a musician, or a smoking gun. The following situations can cause de-synchronization during video streaming:
- Separate transmission of the audio and video signals
- De-multiplexing and re-multiplexing of the audio and video signals
- Player decodes the audio signal faster than the video signal
- Different buffering mechanisms
In CS64K video telephony, the audio and video data are multiplexed according to the 3G-324M standard. To avoid de-synchronization during transmission, the SwissQual video telephony solution uses the 3G-324M stack to feed the multiplexer directly with pre-encoded audio and video signals. The receiving end consists of a decoder and a simple player, which does not require buffering or other intelligence. As a result, de-synchronization is eliminated and the LipSync value is assumed to be 0 ms.
Audio and Video Connect Trigger Points
The 3G-324M stack uses the time between these two trigger points, which is normally a few seconds, for protocol exchange and acknowledgements. Diversity uses these trigger points to track the call progress with respect to call success as well as to calculate channel setup durations. Due to the fixed correspondence of the signals, the audio and video logical channels open synchronously (see Lip Synchronization). As a result, the 3G-324M stack interprets the opening of these two channels as a single trigger point. Due to the amount of time that is required to decode a video frame, the first video image does not appear immediately. SwissQual uses a fixed frame rate of 5 fps to ensure that this image is captured at least 200 ms after the audio and video channels open. However, in reality, this value depends on the frame rate
Restrictions
Video coding and evaluation consume a significant amount of processing power, particularly in a Diversity PCM. To reduce the processing load on the PCM, you need to observe the following rules during an active measurement:
- Do not switch on any of the debug or log files
- Do not remotely log in to a PCM, for example, with Remote Desktop Connection
- Do not start background activities such as file copying or virus scanning
You can only use the SwissQual video telephony solution with the following mobile phones:
- Nokia 6630
- Nokia 6680
- Nokia N80
- Nokia N95
- Nokia N6120/6121
Figure 3-3 Example schematic of the serial link from the Host PC to the phone
If you want, you can use a mixed configuration approach to video telephony. This approach involves the following video stack options:
- Phone to Video Stack: Mobile phone with an internal video stack calls a Diversity system that runs the video stack on a host PC. The stack sends out the common video clip and the voice clip that you specified in the speech test. Transmission is only supported in the video stack to phone direction, that is, video stack always sends and phone always receives.
- Phone to Responder: Mobile phone calls a responder station, which plays a video clip. The current solution only requires that the B-side answers the call.
If you also want to evaluate the video signal on the A-side, you need to install a camera to capture the display of the A-side device. Note: Diversity automatically enables camera capturing when you connect a camera to the system. To allow sufficient time to evaluate the video signal, you need to enter a Record HO value of 12500 ms. However, if the camera is disabled or you just want to evaluate the voice signal at the A-side, use the default
Chapter 4 | Video Telephony to Video Stack or to Responder CONFIDENTIAL MATERIALS 9
The responder cannot be controlled by the signal that is played. As a result, this signal must exactly match the timings of the test and the test files of the calling A-side. That is, the signal must include the synchronization sequence at the beginning of the transmission as well as all voice and video clips that the job expects. If you also want to evaluate the video signal on the A-side, you need to install a camera to capture the display of the A-side device. Note: Diversity automatically enables camera capturing when you connect a camera to the system. Figure 4-3 shows a common example of a pre-defined video clip. In this clip, the responder's clip generation already allots a sufficient time slot for evaluating the video signal.
[Diagram: video track and audio track; a sync sequence of 4.5 s followed by three slots of 12.5 s each]
Figure 4-3 Temporal structure for an audio-video clip that is used in responder mode.
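The timing in Figure 4-3 can be written out as a list of time windows. The following Python sketch assumes that the 12.5 s slots follow the 4.5 s sync sequence without gaps. The function name is hypothetical, and the slot count of three is taken from the figure.

```python
# Durations taken from Figure 4-3.
# Assumption: slots follow the sync sequence back to back.
SYNC_MS = 4_500
SLOT_MS = 12_500


def slot_windows(n_slots, sync_ms=SYNC_MS, slot_ms=SLOT_MS):
    """Return (start_ms, end_ms) for each clip slot after the sync sequence."""
    return [
        (sync_ms + i * slot_ms, sync_ms + (i + 1) * slot_ms)
        for i in range(n_slots)
    ]


for start_ms, end_ms in slot_windows(3):
    print(f"slot from {start_ms / 1000:.1f} s to {end_ms / 1000:.1f} s")
```

Under this assumption, the signal that the responder plays has a total length of 42 s, and the A-side job must expect the clips at exactly these offsets.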
The pre-defined clip matches the exact structure between the sync sequence and the successive clips. In this series, Diversity analyses every second clip, where the clip length is 6 s and the Record HO is 18500 ms. However, the frequency of clips in the signal is higher and can be equipped for high-speed analysis, which, similar to the Qualcomm FFA phone, can be configured to avoid muting during long silence intervals.
Important: You can only use this option in conjunction with a Qualcomm FFA QPT6800 phone, which uses a packet-switched connection and a SIP VoIP client on the phone for video telephony. This video telephony client allows the playback of pre-stored video files. If you want to emulate the Phone to Responder option in the Phone to Phone option, you need to add the following entry to the Windows registry: