
CASE STUDY

NAME : TANMAY MEHTA

COURSE : MBA TECH

BRANCH : TELECOM

ROLL NO : 527

PREFACE

The acronym MPEG stands for Moving Picture Experts Group, which worked to generate the specifications under ISO, the International Organization for Standardization, and IEC, the International Electrotechnical Commission. What is commonly referred to as "MPEG video" actually consists at the present time of two finalized standards, MPEG-1 and MPEG-2, with a third standard, MPEG-4, in the process of being finalized. The MPEG-1 and MPEG-2 standards are similar in basic concepts: they are both based on motion-compensated, block-based transform coding techniques, while MPEG-4 deviates from these more traditional approaches in its usage of software image construct descriptors, for target bitrates in the very low range, below 64 kbit/s. Because MPEG-1 and MPEG-2 are finalized standards and are both presently being utilized in a large number of applications, this case study concentrates on compression techniques relating only to these two standards. As for MPEG-3, it was originally anticipated that this standard would refer to HDTV applications, but it was found that minor extensions to the MPEG-2 standard would suffice for this higher-bitrate, higher-resolution application, so work on a separate MPEG-3 standard was abandoned.

CONTENTS

*Introduction
*History
*Video Compression
*Video Quality
*MPEG
*MPEG Standards
*MPEG Video Compression Technology
*MPEG Specification
*System
 -Elementary Streams
 -System Clock Reference
 -Program Streams
 -Presentation Time Stamps
 -Decoding Time Stamps
 -Multiplexing
*Video
 -Resolution
 -Bitrate
 -I frame
 -P frame
 -B frame
 -D frame
 -Macroblock
 -Motion Vectors
*Audio
*Illustration 1 : 32 sub band filter bank
*Illustration 2 : FFT analysis
*Discrete Cosine Transform
*Important Tables
*Applications
*References

INTRODUCTION

MPEG video compression is used in many current and emerging products. It is at the heart of digital television set-top boxes, DSS, HDTV decoders, DVD players, video conferencing, Internet video, and other applications. These applications benefit from video compression in that they may require less storage space for archived video information, less bandwidth for the transmission of the video information from one point to another, or a combination of both. Besides the fact that it works well in a wide variety of applications, a large part of its popularity is that it is defined in two finalized international standards, with a third standard currently in the definition process. The purpose of this case study is to introduce the basics of MPEG video compression, from both an encoding and a decoding perspective.

HISTORY

Modeled on the successful collaborative approach and the compression technologies developed by the Joint Photographic Experts Group and CCITT's Experts Group on Telephony (creators of the JPEG image compression standard and the H.261 standard for video conferencing respectively) the Moving Picture Experts Group (MPEG) working group was established in January 1988. MPEG was formed to address the need for standard video and audio formats, and build on H.261 to get better quality through the use of more complex encoding methods. Development of the MPEG-1 standard began in May 1988. 14 video and 14 audio codec proposals were submitted by individual companies and institutions for evaluation. The codecs were extensively tested for computational complexity and subjective (human perceived) quality, at data rates of 1.5 Mbit/s. This specific bitrate was chosen for transmission over T-1/E-1 lines and as the approximate data rate of audio CDs. The codecs that excelled in this testing were utilized as the basis for the standard and refined further, with additional features and other improvements being incorporated in the process.

After 20 meetings of the full group in various cities around the world, and 4 years of development and testing, the final standard (for parts 1-3) was approved in early November 1992 and published a few months later. The reported completion date of the MPEG-1 standard varies greatly: a largely complete draft standard was produced in September 1990, and from that point on, only minor changes were introduced. The draft standard was publicly available for purchase. The standard was finished with the 6 November 1992 meeting. The Berkeley Plateau Multimedia Research Group developed an MPEG-1 decoder in November 1992. In July 1990, before the first draft of the MPEG-1 standard had even been written, work began on a second standard, MPEG-2, intended to extend MPEG-1 technology to provide full broadcast-quality video (as per CCIR 601) at high bitrates (3-15 Mbit/s), and support for interlaced video. Due in part to the similarity between the two codecs, the MPEG-2 standard includes full backwards compatibility with MPEG-1 video, so any MPEG-2 decoder can play MPEG-1 videos. Notably, the MPEG-1 standard very strictly defines the bitstream and decoder function, but does not define how MPEG-1 encoding is to be performed. This means that MPEG-1 coding efficiency can vary drastically depending on the encoder used, and generally means that newer encoders perform significantly better than their predecessors. The first three parts (Systems, Video and Audio) of ISO/IEC 11172 were published in August 1993.

VIDEO COMPRESSION
Video compression refers to reducing the quantity of data used to represent digital video images, and is a combination of spatial image compression and temporal motion compensation. Video compression is an example of the concept of source coding in Information theory. This case study deals with its applications: compressed video can effectively reduce the bandwidth required to transmit video via terrestrial broadcast, via cable TV, or via satellite TV services.

VIDEO QUALITY

Most video compression is lossy: it operates on the premise that much of the data present before compression is not necessary for achieving good perceptual quality. For example, DVDs use a video coding standard called MPEG-2 that can compress around two hours of video data by 15 to 30 times, while still producing a picture quality that is generally considered high-quality for standard-definition video. Video compression is a trade-off between disk space, video quality, and the cost of hardware required to decompress the video in a reasonable time. However, if the video is over-compressed in a lossy manner, visible (and sometimes distracting) artifacts can appear.

Video compression typically operates on square-shaped groups of neighboring pixels, often called macroblocks. These pixel groups or blocks of pixels are compared from one frame to the next and the video compression codec (encode/decode scheme) sends only the differences within those blocks. This works extremely well if the video has no motion. A still frame of text, for example, can be repeated with very little transmitted data. In areas of video with more motion, more pixels change from one frame to the next. When more pixels change, the video compression scheme must send more data to keep up with the larger number of pixels that are changing.

VIDEO COMPRESSION TECHNOLOGY

At its most basic level, compression is performed when an input video stream is analyzed and information that is indiscernible to the viewer is discarded. Each event is then assigned a code: commonly occurring events are assigned codes with few bits, while rare events are assigned codes with more bits. These steps are commonly called signal analysis, quantization and variable-length encoding respectively. There are four main methods for compression: discrete cosine transform (DCT), vector quantization (VQ), fractal compression, and discrete wavelet transform (DWT). The discrete cosine transform is a lossy compression algorithm that samples an image at regular intervals, analyzes the frequency components present in the sample, and discards those frequencies which do not affect the image as the human eye perceives it. DCT is the basis of standards such as JPEG, MPEG, H.261, and H.263, and is described in more detail later in this case study.
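As a rough, self-contained illustration of the quantization and variable-length encoding stages described above, the following Python sketch quantizes a short, made-up list of transform coefficients and then builds a Huffman code over the quantized values, so that the most common value (zero) receives the shortest code. The coefficient values, quantizer step size and helper name are illustrative only and are not taken from any MPEG specification.

    import heapq
    from collections import Counter

    def huffman_code(symbols):
        """Build a Huffman code: frequent symbols get short codes, rare symbols long codes."""
        freq = Counter(symbols)
        # Heap entries: (frequency, tie-breaker, {symbol: code-so-far})
        heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)
            f2, i, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, i, merged))
        return heap[0][2]

    # Made-up "events": transform coefficients in which zero dominates after quantization.
    coefficients = [12, 0, 0, -3, 0, 0, 0, 5, 0, 0, 0, 0, -3, 0, 0, 0]
    step = 4
    quantized = [round(c / step) for c in coefficients]      # quantization discards precision
    codes = huffman_code(quantized)                          # variable-length encoding
    bitstream = "".join(codes[q] for q in quantized)
    print(codes)
    print(bitstream, len(bitstream), "bits")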

Vector quantization is a lossy compression that looks at an array of data, instead of individual values. It can then generalize what it sees, compressing redundant data, while at the same time retaining the desired object or data stream's original intent.

Fractal compression is a form of VQ and is also a lossy compression. Compression is performed by locating self-similar sections of an image, then using a fractal algorithm to generate the sections.

Like DCT, the discrete wavelet transform mathematically transforms an image into frequency components. The process is performed on the entire image, which differs from the other methods (such as DCT) that work on smaller pieces of the data. The result is a hierarchical representation of an image, where each layer represents a frequency band.

MPEG

MOVING PICTURE EXPERTS GROUP

The Moving Picture Experts Group (MPEG) was formed by the ISO to set standards for audio and video compression and transmission. It was established in 1988 and its first meeting was in May 1988 in Ottawa, Canada. As of late 2005, MPEG had grown to include approximately 350 members per meeting from various industries, universities, and research institutions. MPEG's official designation is ISO/IEC JTC1/SC29 WG11 - Coding of moving pictures and audio.

STANDARDS

The MPEG standards consist of different Parts. Each part covers a certain aspect of the whole specification. The standards also specify Profiles and Levels. Profiles are intended to define a set of tools that are available, and Levels define the range of appropriate values for the properties associated with them. MPEG has standardized the following compression formats and ancillary standards:

MPEG-1 (1993): Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s (ISO/IEC 11172). The first MPEG compression standard for audio and video. It was basically designed to allow moving pictures and sound to be encoded into the bitrate of a Compact Disc. It is used on Video CD and SVCD, and can be used for low-quality video on DVD Video. It was used in digital satellite/cable TV services before MPEG-2 became widespread. To meet the low bitrate requirement, MPEG-1 downsamples the images and uses picture rates of only 24-30 Hz, resulting in moderate quality. It includes the popular Layer 3 (MP3) audio compression format.

MPEG-2 (1995): Generic coding of moving pictures and associated audio information (ISO/IEC 13818). Transport, video and audio standards for broadcast-quality television. The MPEG-2 standard was considerably broader in scope and of wider appeal, supporting interlacing and high definition. MPEG-2 is considered important because it has been chosen as the compression scheme for over-the-air digital television (ATSC, DVB and ISDB), digital satellite TV services like Dish Network, digital cable television signals, SVCD, DVD Video and Blu-ray.

MPEG-3: MPEG-3 dealt with standardizing scalable and multi-resolution compression and was intended for HDTV compression, but it was found to be redundant and was merged with MPEG-2; as a result there is no MPEG-3 standard. MPEG-3 is not to be confused with MP3, which is MPEG-1 Audio Layer 3.

MPEG-4 (1998): Coding of audio-visual objects (ISO/IEC 14496). MPEG-4 uses further coding tools with additional complexity to achieve higher compression factors than MPEG-2. In addition to more efficient coding of video, MPEG-4 moves closer to computer graphics applications. In more complex profiles, the MPEG-4 decoder effectively becomes a rendering processor and the compressed bitstream describes three-dimensional shapes and surface texture. MPEG-4 also provides Intellectual Property Management and Protection (IPMP), the facility to use proprietary technologies to manage and protect content, such as digital rights management. Several newer, higher-efficiency video standards (newer than MPEG-2 Video) are included.

In addition, the following standards, while not sequential advances to the video encoding standard as with MPEG-1 through MPEG-4, are referred to by similar notation:

MPEG-7 (2002): Multimedia content description interface. (ISO/IEC 15938)

MPEG-21 (2001): Multimedia framework (MPEG-21). (ISO/IEC 21000) MPEG describes this standard as a multimedia framework and it provides for intellectual property management and protection.

Moreover, more recently than the standards above, MPEG has started a series of application-oriented international standards; each of these standards collects multiple MPEG technologies for a particular class of application. (For example, MPEG-A includes a number of technologies on multimedia application format.)

MPEG-A (2007): Multimedia application format (MPEG-A). (ISO/IEC 23000) (e.g. purpose for multimedia application formats, MPEG music player application format, MPEG photo player application format and others)

MPEG-B (2006): MPEG systems technologies. (ISO/IEC 23001) (e.g. Binary MPEG format for XML, Fragment Request Units, Bitstream Syntax Description Language (BSDL) and others)

MPEG-C (2006): MPEG video technologies. (ISO/IEC 23002) (e.g. accuracy requirements for implementation of integer-output 8x8 inverse discrete cosine transform and others)

MPEG-D (2007): MPEG audio technologies. (ISO/IEC 23003) (e.g. MPEG Surround and two parts under development: SAOC - Spatial Audio Object Coding and USAC - Unified Speech and Audio Coding)

MPEG-E (2007): Multimedia Middleware. (ISO/IEC 23004) (a.k.a. M3W) (e.g. architecture, multimedia application programming interface (API), component model and others)

MPEG VIDEO COMPRESSION TECHNIQUE


A MPEG "film" is a sequence of three kinds of frames: The I-frames are intra coded, i.e. they can be reconstructed without any reference to other frames. The Pframes are forward predicted from the last I-frame or Pframe, i.e. it is impossible to reconstruct them without the data of another frame (I or P). The B-frames are both, forward predicted and backward predicted from the last/next I-frame or P-frame, i.e. there are two other frames necessary to reconstruct them. P-frames and B-frames are referred to as inter coded frames.

As an example, a sequence that is displayed as I B B B P B B B P is transferred in the order I P B B B P B B B. The only task of the decoder is to reorder the reconstructed frames; to support this, an ascending frame number comes with each frame. What does "prediction" mean? Imagine an I-frame showing a triangle on a white background. A following P-frame shows the same triangle but at another position. Prediction means to supply a motion vector which declares how to move the triangle on the I-frame to obtain the triangle in the P-frame. This motion vector is part of the MPEG stream and it is divided into a horizontal and a vertical part. These parts can be positive or negative. A positive value means motion to the right or motion downwards, respectively; a negative value means motion to the left or motion upwards, respectively. The parts of the motion vector are in a range of -64 ... +63, so the referred area can be up to 64 pixels away in either direction. This model assumes, however, that every change between frames can be expressed as a simple displacement of pixels, which is not always true: if a rectangle is not only shifted but also slightly rotated, a simple displacement of the rectangle will cause a prediction error. Therefore the MPEG stream contains a matrix for compensating this prediction error. Thus, the reconstruction of inter coded frames proceeds in two steps:

1. Application of the motion vector to the referred frame;
2. Adding the prediction error compensation to the result.

The prediction error compensation requires fewer bytes than the whole frame because most of its values are zero and can be discarded from the MPEG stream. Furthermore, the DCT compression (see later in this chapter) is applied to the prediction error, which decreases its size. Note also that the two additions have different meanings: the first adds the motion vector to the x- and y-coordinates of each pixel, while the second adds an error value to the color value of the corresponding pixel.
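A minimal sketch of those two reconstruction steps is shown below, assuming a single 16x16 block, an already-decoded reference frame, and a motion vector given as (vertical, horizontal) pixel displacements. The function and variable names are hypothetical and the error matrix is made up.

    import numpy as np

    def reconstruct_block(ref_frame, mv, top, left, size, error):
        """Step 1: fetch the referenced area displaced by the motion vector.
        Step 2: add the transmitted prediction-error values and clamp to 0..255."""
        dy, dx = mv                                   # positive = down / right, as in the text
        src = ref_frame[top + dy : top + dy + size, left + dx : left + dx + size]
        return np.clip(src.astype(int) + error, 0, 255).astype(np.uint8)

    ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # stand-in reference frame
    err = np.zeros((16, 16), dtype=int)
    err[5, 5] = -4                                    # a small correction where prediction missed
    block = reconstruct_block(ref, mv=(2, 3), top=16, left=16, size=16, error=err)
    print(block.shape)                                # (16, 16)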

What if some parts move to the left and others to the right? The motion vector isn't valid for the whole frame. Instead, the frame is divided into macro blocks of 16x16 pixels, and every macro block has its own motion vector. Of course, this does not avoid contradictory motion, but it minimizes its probability. And if contradictory motion occurs? One of the greatest misunderstandings of the MPEG compression technique is to assume that all macro blocks of P-frames are predicted. If the prediction error is too big, the coder can decide to intra-code a macro block. Similarly, the macro blocks in B-frames can be forward predicted, backward predicted, forward and backward predicted, or intra-coded. Every macro block contains 4 luminance blocks and 2 chrominance blocks, and every block has a dimension of 8x8 values. The luminance blocks contain information about the brightness of every pixel in the macro block, while the chrominance blocks contain color information. Because of some properties of the human eye it isn't necessary to give color information for every pixel; instead, 4 pixels share one color value. This color value is divided into two parts: the first is in the Cb color block, the second is in the Cr color block. Depending on the kind of macro block, the blocks contain pixel information or prediction error information as mentioned above. In any case the information is compressed using the discrete cosine transform (DCT).

MPEG SPECIFICATION

Part 1: Systems


Part 1 of the MPEG-1 standard covers systems, and is defined in ISO/IEC 11172-1. MPEG-1 Systems specifies the logical layout and methods used to store the encoded audio, video, and other data into a standard bitstream, and to maintain synchronization between the different contents. This file format is specifically designed for storage on media, and transmission over data channels, that are considered relatively reliable. Only limited error protection is defined by the standard, and small errors in the bitstream may cause noticeable defects.

Elementary streams
Elementary streams (ES) are the raw bitstreams of MPEG-1 audio and video, output by an encoder. These files can be distributed on their own, such as is the case with MP3 files. Additionally, elementary streams can be made more robust by packetizing them, i.e., dividing them into independent chunks, and adding a cyclic redundancy check (CRC) checksum to each segment for error detection. This is the Packetized Elementary Stream (PES) structure.
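The sketch below illustrates the idea of packetizing an elementary stream: split it into fixed-size chunks and append a CRC-32 checksum to each so corruption can be detected. Real PES packets carry a start code, stream id, length field and optional timestamps, none of which are modelled here; the chunk size is arbitrary.

    import zlib

    def packetize(elementary_stream: bytes, chunk_size: int = 2048):
        """Divide a raw elementary stream into independent chunks, each followed
        by a 4-byte CRC-32 checksum for error detection."""
        packets = []
        for offset in range(0, len(elementary_stream), chunk_size):
            payload = elementary_stream[offset : offset + chunk_size]
            crc = zlib.crc32(payload).to_bytes(4, "big")
            packets.append(payload + crc)
        return packets

    packets = packetize(b"\x00" * 5000)
    print(len(packets), [len(p) for p in packets])    # 3 packets: 2052, 2052 and 908 bytes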

System Clock Reference (SCR) is a timing value stored in a 33-bit header of each ES, at a frequency/precision of 90 kHz, with an extra 9-bit extension that stores additional timing data with a precision of 27 MHz. These values are inserted by the encoder, derived from the system time clock (STC). Simultaneously encoded audio and video streams will not have identical SCR values, however, due to buffering, encoding, jitter, and other delays.
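The relationship between the 27 MHz system time clock and the two SCR fields can be sketched as below: the 33-bit base counts 90 kHz ticks (the STC divided by 300) and the 9-bit extension keeps the remaining 27 MHz cycles. The function name is illustrative only.

    def scr_fields(stc_27mhz: int):
        """Split a 27 MHz system time clock value into the 33-bit SCR base
        (90 kHz units) and the 9-bit SCR extension (remaining cycles, 0..299)."""
        base = (stc_27mhz // 300) % (1 << 33)   # 33-bit base wraps after roughly 26.5 hours
        ext = stc_27mhz % 300
        return base, ext

    # One second of 27 MHz ticks: the base advances by 90 000, the extension is 0.
    print(scr_fields(27_000_000))               # (90000, 0)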

Program streams
Program Streams (PS) are concerned with combining multiple packetized elementary streams (usually just one audio and video PES) into a single stream, ensuring simultaneous delivery, and maintaining synchronization. The PS structure is known as a multiplex, or a container format.

Presentation time stamps (PTS) exist in the PS to correct the inevitable disparity between audio and video SCR values (time-base correction). 90 kHz PTS values in the PS header tell the decoder which video SCR values match which audio SCR values. The PTS determines when to display a portion of an MPEG program, and is also used by the decoder to determine when data can be discarded from the buffer. Either video or audio will be delayed by the decoder until the corresponding segment of the other arrives and can be decoded.

Decoding Time Stamps (DTS), additionally, are required because of B-frames. With B-frames in the video stream, adjacent frames have to be encoded and decoded out of order (re-ordered frames). DTS is quite similar to PTS, but instead of just handling sequential frames, it contains the proper timestamps to tell the decoder when to decode and display the next B-frame (the frame types are explained below), ahead of its anchor (P- or I-) frame. Without B-frames in the video, PTS and DTS values are identical.
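The reordering that makes DTS necessary can be illustrated with a toy example: each B-frame must wait for the anchor that follows it in display order, so the anchor is moved ahead of it in decode (transmission) order. This sketch works on frame labels only; a real decoder is driven by the DTS/PTS values carried in the stream.

    # Display order of a short sequence and the order a decoder must process it in,
    # because every B-frame needs the *next* anchor (I or P) decoded first.
    display_order = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]

    def decode_order(frames):
        """Move each anchor (I/P) frame ahead of the B-frames that reference it."""
        out, pending_b = [], []
        for f in frames:
            if f.startswith("B"):
                pending_b.append(f)       # hold B-frames until their future anchor arrives
            else:
                out.append(f)             # the anchor goes out first ...
                out.extend(pending_b)     # ... then the B-frames that depended on it
                pending_b = []
        return out + pending_b

    print(decode_order(display_order))    # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']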

Multiplexing
To generate the PS, the multiplexer will interleave the (two or more) packetized elementary streams. This is done so the packets of the simultaneous streams can be transferred over the same channel and are guaranteed to both arrive at the decoder at precisely the same time. This is a case of time-division multiplexing.
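A highly simplified picture of this interleaving, assuming each packet already carries a timestamp, is to merge the two packet lists in timestamp order. A real Program Stream multiplexer also has to respect decoder buffer sizes and pack/packet header overhead, which this sketch ignores; the packet lists are made up.

    import heapq

    def multiplex(audio_packets, video_packets):
        """Interleave two packetized streams by timestamp; each packet is a
        (timestamp, payload) tuple and both input lists are already sorted."""
        return list(heapq.merge(audio_packets, video_packets))

    audio = [(0, "A0"), (1152, "A1"), (2304, "A2")]
    video = [(0, "V0"), (3003, "V1"), (6006, "V2")]
    print(multiplex(audio, video))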

Part 2: Video
Part 2 of the MPEG-1 standard covers video and is defined in ISO/IEC 11172-2. The design was heavily influenced by H.261. MPEG-1 Video exploits perceptual compression methods to significantly reduce the data rate required by a video stream. It reduces or completely discards information in certain frequencies and areas of the picture that the human eye has limited ability to fully perceive. It also utilizes effective methods to exploit temporal (over time) and spatial (across a picture) redundancy common in video, to achieve better data compression than would be possible otherwise.

Color space

Example of 4:2:0 subsampling. The two overlapping center circles represent chroma blue and chroma red (color) pixels, while the 4 outside circles represent the luma (brightness). Before encoding video to MPEG-1, the color-space is transformed to Y'CbCr (Y'=Luma, Cb=Chroma Blue, Cr=Chroma Red). Luma (brightness, resolution) is stored separately from chroma (color, hue, phase) and even further separated into red and blue components. The chroma is also subsampled to 4:2:0, meaning it is decimated by one half vertically and one half horizontally, to just one quarter the resolution of the video.[1] Because the human eye is much less sensitive to small changes in color than in brightness, chroma subsampling is a very effective way to reduce the amount of video data that needs to be compressed. On videos with fine detail (high spatial complexity) this can manifest as chroma aliasing artifacts. Compared to other digital compression artifacts, this issue seems to be very rarely a source of annoyance. Because of subsampling, Y'CbCr video must always be stored using even dimensions (divisible by 2), otherwise chroma mismatch will occur, and it will appear as if the color is ahead of, or behind the rest of the video, much like a shadow. Y'CbCr is often inaccurately called YUV which is only used in the domain of analog video signals. Similarly, the terms luminance and chrominance are often used instead of the (more accurate) terms luma and chroma.
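The following sketch shows the color-space transform and the 4:2:0 decimation on an RGB image, using the common BT.601-style conversion coefficients and a simple 2x2 averaging of the chroma planes. The exact chroma sample positions (siting) and conversion matrix used by a given MPEG profile may differ; this is only meant to show why the chroma planes end up at one quarter of the pixel count.

    import numpy as np

    def rgb_to_ycbcr_420(rgb):
        """Convert an RGB image (H x W x 3, even H and W) to full-resolution luma
        plus 4:2:0-subsampled chroma, using BT.601-style coefficients."""
        rgb = rgb.astype(np.float64)
        r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
        y  =  0.299 * r + 0.587 * g + 0.114 * b
        cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
        cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
        # Decimate chroma by 2 in both directions: average every 2x2 neighbourhood.
        cb420 = cb.reshape(cb.shape[0] // 2, 2, cb.shape[1] // 2, 2).mean(axis=(1, 3))
        cr420 = cr.reshape(cr.shape[0] // 2, 2, cr.shape[1] // 2, 2).mean(axis=(1, 3))
        return y, cb420, cr420

    y, cb, cr = rgb_to_ycbcr_420(np.zeros((32, 32, 3), dtype=np.uint8))
    print(y.shape, cb.shape, cr.shape)   # (32, 32) (16, 16) (16, 16)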

Resolution/Bitrate
MPEG-1 supports resolutions up to 4095x4095 (12 bits per dimension), and bitrates up to 100 Mbit/s. MPEG-1 videos are most commonly seen using Source Input Format (SIF) resolution: 352x240, 352x288, or 320x240. These low resolutions, combined with a bitrate of less than 1.5 Mbit/s, make up what is known as a constrained parameters bitstream (CPB), later renamed the "Low Level" (LL) profile in MPEG-2. This is the minimum video specification any decoder should be able to handle to be considered MPEG-1 compliant. It was selected to provide a good balance between quality and performance, allowing the use of reasonably inexpensive hardware of the time.

Frame/picture/block types
MPEG-1 has several frame/picture types that serve different purposes. The most important, yet simplest are I-frames.

I-frames
I-frame is an abbreviation for intra-frame, so called because I-frames can be decoded independently of any other frames. They may also be known as I-pictures, or keyframes, due to their somewhat similar function to the key frames used in animation. I-frames can be considered effectively identical to baseline JPEG images. High-speed seeking through an MPEG-1 video is only possible to the nearest I-frame. When cutting a video it is not possible to start playback of a segment of video before the first I-frame in the segment (at least not without computationally intensive re-encoding). For this reason, I-frame-only MPEG videos are used in editing applications. I-frame-only compression is very fast, but produces very large file sizes: a factor of 3 (or more) larger than normally encoded MPEG-1 video, depending on how temporally complex a specific video is. I-frame-only MPEG-1 video is very similar to MJPEG video, so much so that a very high-speed and theoretically lossless conversion can be made from one format to the other, provided a couple of restrictions (color space and quantization matrix) are followed in the creation of the bitstream.

P-frames
P-frame is an abbreviation for predicted-frame. They may also be called forward-predicted frames, or inter-frames. P-frames exist to improve compression by exploiting the temporal (over time) redundancy in a video. P-frames store only the difference in image from the frame (either an I-frame or P-frame) immediately preceding it (this reference frame is also called the anchor frame).

The difference between a P-frame and its anchor frame is calculated using motion vectors on each macroblock of the frame (see below). Such motion vector data will be embedded in the P-frame for use by the decoder.

B-frames
B-frame stands for bidirectional-frame. They may also be known as backwards-predicted frames or B-pictures. B-frames are quite similar to P-frames, except they can make predictions using both the previous and future frames (i.e. two anchor frames). It is therefore necessary for the player to first decode the next I- or P- anchor frame sequentially after the B-frame, before the B-frame can be decoded and displayed. This makes B-frames computationally complex, requires larger data buffers, and causes an increased delay on both decoding and encoding. This also necessitates the decoding time stamps (DTS) feature in the container/system stream (see above). As such, B-frames have long been the subject of much controversy; they are often avoided in videos, and are sometimes not fully supported by hardware decoders. No other frames are predicted from a B-frame. Because of this, a very low bitrate B-frame can be inserted, where needed, to help control the bitrate. If this were done with a P-frame, future P-frames would be predicted from it and would lower the quality of the entire sequence.
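A bidirectionally predicted macroblock can be approximated as the average of the forward and backward predictions, plus the transmitted error, as in the sketch below. In the real codec each direction has its own motion vector and the encoder chooses per macroblock between forward, backward, interpolated and intra coding; the flat example blocks here are purely illustrative.

    import numpy as np

    def predict_bidirectional(past_block, future_block, error):
        """Average the forward and backward predictions (rounded), then add the
        transmitted prediction error and clamp to the 8-bit pixel range."""
        avg = (past_block.astype(int) + future_block.astype(int) + 1) // 2
        return np.clip(avg + error, 0, 255).astype(np.uint8)

    past = np.full((16, 16), 100, dtype=np.uint8)
    future = np.full((16, 16), 110, dtype=np.uint8)
    print(predict_bidirectional(past, future, np.zeros((16, 16), dtype=int))[0, 0])  # 105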

D-frames
MPEG-1 has a unique frame type not found in later video standards. D-frames or DC-pictures are independent images (intra-frames) that have been encoded DC-only and hence are very low quality. D-frames are never referenced by I-, P- or B-frames. D-frames are only used for fast previews of video, for instance when seeking through a video at high speed. Given moderately higher-performance decoding equipment, this feature can be approximated by decoding I-frames instead, which provides higher-quality previews without the need for D-frames taking up space in the stream while not improving the normal video quality.
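Because the DC coefficient of an 8x8 DCT is (up to scaling) just the block average, a D-frame-style preview amounts to keeping one value per 8x8 block. The sketch below produces such a 1/8-resolution thumbnail from a decoded luma frame; it illustrates the effect only and is not the D-frame bitstream format.

    import numpy as np

    def dc_preview(frame):
        """Keep only the average (DC) value of every 8x8 block, giving a
        1/8-resolution preview image."""
        h, w = frame.shape
        blocks = frame[: h - h % 8, : w - w % 8].reshape(h // 8, 8, w // 8, 8)
        return blocks.mean(axis=(1, 3)).astype(np.uint8)

    frame = np.random.randint(0, 256, (240, 352), dtype=np.uint8)   # SIF-sized luma plane
    print(dc_preview(frame).shape)                                  # (30, 44)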

Macroblocks
MPEG-1 operates on video in a series of 8x8 blocks for quantization. However, because chroma (color) is subsampled by a factor of 4, each pair of (red and blue) chroma blocks corresponds to 4 different luma blocks. This set of 6 blocks, with a resolution of 16x16, is called a macroblock. A macroblock is the smallest independent unit of (color) video. Motion vectors (see below) operate solely at the macroblock level.
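Concretely, one macroblock can be viewed as the six 8x8 blocks below: the 16x16 luma area split into four 8x8 blocks, plus the co-located 8x8 Cb and 8x8 Cr blocks from the subsampled chroma planes. The helper name and the zero-filled arrays are just for illustration.

    import numpy as np

    def split_macroblock(y16, cb8, cr8):
        """Return the six 8x8 blocks of one macroblock: four luma, one Cb, one Cr."""
        y_blocks = [y16[r : r + 8, c : c + 8] for r in (0, 8) for c in (0, 8)]
        return y_blocks + [cb8, cr8]

    blocks = split_macroblock(np.zeros((16, 16)), np.zeros((8, 8)), np.zeros((8, 8)))
    print(len(blocks), [b.shape for b in blocks])   # 6 blocks, each 8x8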

Motion vectors
To decrease the amount of temporal redundancy in a video, only blocks that change are updated (up to the maximum GOP size). This is known as conditional replenishment. However, this is not very effective by itself. Movement of the objects, and/or the camera, may result in large portions of the frame needing to be updated, even though only the position of the previously encoded objects has changed. Through motion estimation the encoder can compensate for this movement and remove a large amount of redundant information. The encoder compares the current frame with adjacent parts of the video from the anchor frame (previous I- or P-frame) in a diamond pattern, up to an (encoder-specific) predefined radius limit from the area of the current macroblock. If a match is found, only the direction and distance (i.e. the vector of the motion) from the previous video area to the current macroblock need to be encoded into the inter-frame (P- or B-frame). The reverse of this process, performed by the decoder to reconstruct the picture, is called motion compensation. Motion vectors record the distance between two areas on screen based on the number of pixels (called pels). MPEG-1 video uses a motion vector (MV) precision of one half of one pixel, or half-pel. The finer the precision of the MVs, the more accurate the match is likely to be, and the more efficient the compression. There are trade-offs to higher precision, however. Finer MVs result in larger data size, as larger numbers must be stored in the frame for every single MV; increased coding complexity, as increasing levels of interpolation on the macroblock are required for both the encoder and decoder; and diminishing returns (minimal gains) with higher precision MVs. Half-pel was chosen as the ideal trade-off.
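The sketch below shows the core of motion estimation for one 16x16 macroblock: compare the current block against candidate areas in the anchor frame and keep the displacement with the smallest sum of absolute differences (SAD). For simplicity it uses a brute-force full search at integer-pixel precision rather than the diamond-pattern, half-pel search described above; the search radius and test data are made up.

    import numpy as np

    def find_motion_vector(current_block, ref_frame, top, left, radius=7):
        """Search the anchor frame around (top, left) for the 16x16 area with the
        smallest SAD; return the best (dy, dx) displacement."""
        best_sad, best_mv = None, (0, 0)
        h, w = ref_frame.shape
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                r, c = top + dy, left + dx
                if r < 0 or c < 0 or r + 16 > h or c + 16 > w:
                    continue                                  # candidate falls outside the frame
                sad = np.abs(ref_frame[r : r + 16, c : c + 16].astype(int)
                             - current_block.astype(int)).sum()
                if best_sad is None or sad < best_sad:
                    best_sad, best_mv = sad, (dy, dx)
        return best_mv

    ref = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    cur = np.roll(ref, shift=(3, -2), axis=(0, 1))[16:32, 16:32]  # content moved down 3, left 2
    print(find_motion_vector(cur, ref, top=16, left=16))          # (-3, 2): fetch from up 3, right 2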

Part 3: Audio
Part 3 of the MPEG-1 standard covers audio and is defined in ISO/IEC 11172-3. MPEG-1 Audio utilizes psychoacoustics to significantly reduce the data rate required by an audio stream. It reduces or completely discards certain parts of the audio that the human ear can't hear, either because they are in frequencies where the ear has limited sensitivity, or because they are masked by other (typically louder) sounds.

Illustration 1: visualization of the 32 sub-band filter bank used by MPEG-1 Audio, showing the disparity between the equal band size of MP2 and the varying width of critical bands ("barks").

The 32 sub-band filter bank returns 32 amplitude coefficients, one for each equal-sized frequency band/segment of the audio, each of which is about 700 Hz wide (depending on the audio's sampling frequency). The encoder then utilizes the psychoacoustic model to determine which sub-bands contain audio information that is less important, and so where quantization will be inaudible, or at least much less noticeable.
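The sketch below gives a rough, FFT-based stand-in for that analysis step: it splits the spectrum of one block of samples into 32 equal-width bands and reports the energy per band, which is the kind of information a psychoacoustic model compares against masking thresholds. MPEG-1 Audio actually uses a polyphase filter bank rather than an FFT of this form, so this is only an approximation of the idea.

    import numpy as np

    def band_energies(samples, num_bands=32):
        """Split the spectrum of one block of audio into equal-width bands and
        return the signal energy in each band."""
        spectrum = np.abs(np.fft.rfft(samples)) ** 2
        usable = spectrum[: (len(spectrum) // num_bands) * num_bands]
        return usable.reshape(num_bands, -1).sum(axis=1)

    fs = 44100                                   # sampling rate in Hz
    t = np.arange(1024) / fs
    tone = np.sin(2 * np.pi * 1000 * t)          # a 1 kHz test tone
    print(np.argmax(band_energies(tone)))        # band 1, i.e. roughly 690-1380 Hz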

Illustration 2: example FFT analysis on an audio wave sample.

DISCRETE COSINE TRANSFORM

In general, neighboring pixels within an image tend to be highly correlated. As such, it is desirable to use an invertible transform to concentrate the signal's randomness into fewer, decorrelated parameters. The Discrete Cosine Transform (DCT) has been shown to be near optimal for a large class of images in terms of energy concentration and decorrelation. The DCT decomposes the signal into underlying spatial frequencies, which then allow further processing techniques to reduce the precision of the DCT coefficients consistent with the Human Visual System (HVS) model. The DCT/IDCT transform operations are described by Equations 1 & 2 respectively:

Equation 1: Forward Discrete Cosine Transform

$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y)\, \cos\!\frac{(2x+1)u\pi}{16}\, \cos\!\frac{(2y+1)v\pi}{16}$$

Equation 2: Inverse Discrete Cosine Transform

$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v)\, \cos\!\frac{(2x+1)u\pi}{16}\, \cos\!\frac{(2y+1)v\pi}{16}$$

where $C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise, $f(x,y)$ are the pixel values of the 8x8 block and $F(u,v)$ are the transform coefficients.
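A direct (unoptimized) evaluation of Equation 1 in Python is given below; real encoders use fast factorizations of the DCT, but the result is the same. A flat block is used as input because it concentrates all of its energy into the single DC coefficient F(0,0), which is exactly the energy-compaction property described above.

    import numpy as np

    def dct_8x8(block):
        """Forward 8x8 DCT of Equation 1; block is an 8x8 array of pixel values."""
        out = np.zeros((8, 8))
        x = np.arange(8)
        for u in range(8):
            for v in range(8):
                cu = 1 / np.sqrt(2) if u == 0 else 1.0
                cv = 1 / np.sqrt(2) if v == 0 else 1.0
                basis = np.outer(np.cos((2 * x + 1) * u * np.pi / 16),
                                 np.cos((2 * x + 1) * v * np.pi / 16))
                out[u, v] = 0.25 * cu * cv * (block * basis).sum()
        return out

    flat = np.full((8, 8), 128.0)        # a perfectly flat block
    print(np.round(dct_8x8(flat), 1))    # only F(0,0) = 1024.0 is non-zero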

An interesting summary of typical hardware requirements is given in the following table:

video profile     | typical decoder transistor count | total DRAM | DRAM bus width, speed
MPEG-1 CPB        | 0.4-0.75 million                 | 4 Mb       | 16 bits, 80 ns
MPEG-1 601        | 0.8-1.1 million                  | 16 Mb      | 64 bits, 80 ns
MPEG-2 MP@ML      | 0.9-1.5 million                  | 16 Mb      | 64 bits, 80 ns
MPEG-2 MP@HL      | 2.0-3.0 million                  | 64 Mb      | N/A

MPEG FILE FORMAT SUMMARY


Type: Audio/video data storage
Colors: Up to 24 bits (4:2:0 YCbCr color space)
Compression: DCT and block-based scheme with motion compensation
Maximum Image Size: 4095x4095, 30 frames/second
Multiple Images Per File: Yes (multiple program multiplexing)
Numerical Format: NA
Originator: Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO)
Platform: All
Supporting Applications: Xing Technologies MPEG player, others
Usage: Stores an MPEG-encoded data stream on a digital storage medium. MPEG is used to encode audio, video, text, and graphical data within a single, synchronized data stream.
Comments: MPEG-1 and MPEG-2 are finalized standards in wide use; the MPEG family continues to be extended and revised for a wider base of applications.

APPLICATIONS

Most popular computer software for video playback includes MPEG-1 decoding, in addition to any other supported formats.

The popularity of MP3 audio has established a massive installed base of hardware that can play back MPEG-1 Audio (all three layers). "Virtually all digital audio devices" can play back MPEG-1 Audio.[30] Many millions have been sold to date.

Before MPEG-2 became widespread, many digital satellite/cable TV services used MPEG-1 exclusively.[9][19] The widespread popularity of MPEG-2 with broadcasters means MPEG-1 is playable by most digital cable and satellite set-top boxes, and digital disc and tape players, due to backwards compatibility.

MPEG-1 is the exclusive video and audio format used on Video CD (VCD), the first consumer digital video format, and still a very popular format around the world. The Super Video CD standard, based on VCD, uses MPEG-1 audio exclusively, as well as MPEG-2 video.

The DVD-Video format uses MPEG-2 video primarily, but MPEG-1 support is explicitly defined in the standard. The DVD-Video standard originally required MPEG-1 Layer II audio for PAL countries, but was changed to allow AC-3/Dolby Digital-only discs. MPEG-1 Layer II audio is still allowed on DVDs, although newer extensions to the format, like MPEG Multichannel, are rarely supported. Most DVD players also support Video CD and MP3 CD playback, which use MPEG-1.

The international Digital Video Broadcasting (DVB) standard primarily uses MPEG-1 Layer II audio and MPEG-2 video. The international Digital Audio Broadcasting (DAB) standard uses MPEG-1 Layer II audio exclusively, due to MP2's especially high quality, modest decoder performance requirements, and tolerance of errors.

REFERENCES

- Elements of Data Compression, by Adam Drozdek

- Introduction to Data Compression, by Khalid Sayood

- Wikipedia

- Encarta
