mshreve@cse.usf.edu
Outline
Basics of Video
Digital Video
MPEG
Summary
Basics of Video
Static scene capture → Image
Bring in motion → Video
Image sequence: A 3-D signal
2 spatial dimensions & 1 time dimension
Continuous I(x, y, t) → discrete I(m, n, t_k)
Video Camera
Frame-by-frame capturing
CCD sensors (Charge-Coupled Devices)
Interlaced
Refreshed twice every frame: the electron gun at the back of the
CRT first excites the phosphors on the even-numbered rows of
pixels, then those on the odd-numbered rows
NTSC frame-rate of 29.97 means the screen is redrawn 59.94
times a second
In other words, 59.94 half-frames per second or 59.94 fields per
second
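The field arithmetic, and the split of a frame into its two fields, can be sketched in a few lines. This is only an illustration: the frame is a toy list of rows, and the name split_fields is hypothetical, not part of any library mentioned in these slides.

```python
from fractions import Fraction

# NTSC frame rate is exactly 30000/1001 frames per second (about 29.97).
NTSC_FRAME_RATE = Fraction(30000, 1001)
# Each interlaced frame is drawn as two fields, so the field rate is doubled.
NTSC_FIELD_RATE = 2 * NTSC_FRAME_RATE

def split_fields(frame):
    """Split a frame (a list of rows) into its even-row and odd-row fields."""
    return frame[0::2], frame[1::2]

# Toy 6-row frame; every pixel in a row carries that row's index.
frame = [[row] * 4 for row in range(6)]
even_field, odd_field = split_fields(frame)

print(f"{float(NTSC_FRAME_RATE):.2f} frames/s, {float(NTSC_FIELD_RATE):.2f} fields/s")
print([r[0] for r in even_field], [r[0] for r in odd_field])
```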
DIGITAL VIDEO
Why Digital?
Exactness
Exact reproduction without degradation
Accurate duplication of processing result
Spatial redundancy
Pixels in a neighborhood have close luminance levels
Low frequency
Random access
Solved by encoding frame without prediction at constant intervals of
time
Bit allocation
according to statistics
constant and variable bit-rate requirement
MPEG
MPEG: Moving Picture Experts Group
Coding of moving pictures and associated audio
Picture part
Can achieve compression ratio of about 50:1 through storing only
the difference between successive frames
Even higher compression ratios possible
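Storing only the difference between successive frames can be illustrated with a toy example. This is a sketch on small integer frames, not the actual MPEG bitstream; the function names are hypothetical.

```python
def frame_difference(prev, curr):
    """Per-pixel difference curr - prev for two equally sized frames."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]

def reconstruct(prev, diff):
    """Rebuild the current frame from the previous frame plus the difference."""
    return [[p + d for p, d in zip(prow, drow)] for prow, drow in zip(prev, diff)]

prev = [[10, 10, 10, 10],
        [10, 50, 50, 10],
        [10, 10, 10, 10]]
curr = [[10, 10, 10, 10],
        [10, 50, 52, 10],   # only one pixel changed between the frames
        [10, 10, 10, 10]]

diff = frame_difference(prev, curr)
nonzero = sum(1 for row in diff for d in row if d != 0)
print(nonzero, "of", len(diff) * len(diff[0]), "difference values are non-zero")
```

Most difference values are zero when little changes between frames, which is what makes the difference cheaper to code than the frame itself.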
Bit Rate
Defined in two ways
bits per second (all inter-frame compression algorithms)
bits per frame (most intra-frame compression algorithms
except DV and MJPEG)
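The two definitions are related through the frame rate. A rough calculation, assuming a hypothetical uncompressed source (352x240 pixels, 24 bits per pixel, 30 frames per second; these numbers are illustrative and not from the slides) and the 50:1 ratio quoted earlier:

```python
def bits_per_second(bits_per_frame, frames_per_second):
    """Convert a per-frame bit budget into a per-second bit rate."""
    return bits_per_frame * frames_per_second

# Hypothetical uncompressed source (assumed values, for illustration only).
width, height, bits_per_pixel, fps = 352, 240, 24, 30

raw_bits_per_frame = width * height * bits_per_pixel
raw_bps = bits_per_second(raw_bits_per_frame, fps)
compressed_bps = raw_bps / 50   # applying a 50:1 compression ratio

print(f"raw: {raw_bps / 1e6:.1f} Mbit/s, at 50:1: {compressed_bps / 1e6:.2f} Mbit/s")
```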
Intra-frame compression
Spatial redundancy
Correlation/compression within a frame
Based on baseline JPEG compression standard
Inter-frame compression
Temporal redundancy
Correlation/compression between like frames
Audio compression
Three different layers (Layer III is known as MP3)
Perceptual Redundancy
Here is an image represented with 8 bits per pixel
The same image at 7 bits per pixel
At 6 bits per pixel
At 5 bits per pixel
At 4 bits per pixel
It is clear that we don't need all these bits!
Our previous example illustrated the eye's sensitivity
to luminance
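The bit-depth reduction shown in the images can be reproduced with a short sketch. It assumes 8-bit luminance samples and simply drops the least significant bits; the slides do not say how the images were produced, so this is one plausible way, and reduce_bit_depth is a hypothetical name.

```python
def reduce_bit_depth(pixel, bits):
    """Keep only the `bits` most significant bits of an 8-bit pixel value."""
    shift = 8 - bits
    return (pixel >> shift) << shift

row = [0, 37, 100, 128, 200, 255]   # sample 8-bit luminance values
for bits in (8, 7, 6, 5, 4):
    reduced = [reduce_bit_depth(p, bits) for p in row]
    max_error = max(abs(p - q) for p, q in zip(row, reduced))
    print(bits, "bits:", reduced, "max error:", max_error)
```

Each bit removed doubles the worst-case error, yet the picture stays recognisable for several steps.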
Fundamentals of JPEG
Encoder: DCT → Quantizer → Entropy coder → Compressed image data
Decoder: Compressed image data → Entropy decoder → Dequantizer → IDCT
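The DCT and quantizer stages of the JPEG pipeline can be sketched on a single 8x8 block. This is a simplified illustration: it assumes one uniform quantizer step (STEP, a made-up value) instead of JPEG's per-coefficient quantization table, and it leaves out the entropy coder entirely.

```python
import math

N = 8      # JPEG transforms 8x8 blocks
STEP = 16  # assumed uniform quantizer step; real JPEG uses a table

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    return [_scale(k) * sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                            for n in range(N)) for k in range(N)]

def idct_1d(c):
    """Inverse of dct_1d."""
    return [sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N)) for n in range(N)]

def transform_2d(block, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(r) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]

# Smooth toy block (a gentle ramp), level-shifted by 128.
block = [[(100 + 4 * x + 2 * y) - 128 for x in range(N)] for y in range(N)]

coeffs = transform_2d(block, dct_1d)                              # DCT
quantized = [[round(c / STEP) for c in row] for row in coeffs]    # Quantizer
dequantized = [[q * STEP for q in row] for row in quantized]      # Dequantizer
restored = transform_2d(dequantized, idct_1d)                     # IDCT

nonzero = sum(1 for row in quantized for q in row if q != 0)
max_error = max(abs(a - b) for ra, rb in zip(block, restored) for a, b in zip(ra, rb))
print(nonzero, "of 64 coefficients are non-zero; max pixel error:", round(max_error, 2))
```

For a smooth block most quantized coefficients are zero, which is what the entropy coder then exploits.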
Figure: anchor frame and motion field under camera zoom
General Considerations for Motion Estimation
Two categories of approaches:
Feature based (more often used in object tracking, 3D
reconstruction from 2D)
Intensity based (based on constant intensity
assumption) (more often used for motion compensated
prediction, required in video coding, frame
interpolation)
Motion Representation
Global: Entire motion field is represented by a few global parameters.
Pixel-based: One MV at each pixel, with some smoothness constraint between adjacent MVs.
Block-based: Entire frame is divided into blocks, and motion in each block is characterized by a few parameters.
Region-based: Entire frame is divided into regions, each region corresponding to an object or sub-object with consistent motion, represented by a few parameters.
Also mesh-based
Examples
Figure: anchor frame, target frame, motion field, predicted target frame
Figure: predicted target frame from EBMA and from the mesh-based method
Motion Compensated Prediction
Divide current frame, i, into disjoint 16×16
macroblocks
Search a window in previous frame, i-1, for
closest match
Calculate the prediction error
For each of the four 8×8 blocks in the
macroblock, perform DCT-based coding
Transmit motion vector + entropy coded
prediction error (lossy coding)
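The search and prediction-error steps can be sketched as an exhaustive block search. The sketch makes several assumptions not stated in the slides: the sum of absolute differences (SAD) is used as the "closest match" criterion, the block is 4×4 rather than 16×16 to keep the toy frame small, and all function names are hypothetical. DCT coding and entropy coding of the error are omitted.

```python
def sad(curr, prev, by, bx, dy, dx, size):
    """Sum of absolute differences between the block of `curr` at (by, bx)
    and the block of `prev` displaced by (dy, dx)."""
    return sum(abs(curr[by + y][bx + x] - prev[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))

def best_motion_vector(curr, prev, by, bx, size, search):
    """Search +/- `search` pixels in `prev`; return (cost, dy, dx) of the best match."""
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the previous frame.
            if not (0 <= by + dy and by + dy + size <= h and
                    0 <= bx + dx and bx + dx + size <= w):
                continue
            cost = sad(curr, prev, by, bx, dy, dx, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best

def prediction_error(curr, prev, by, bx, dy, dx, size):
    """Current block minus its motion-compensated prediction from `prev`."""
    return [[curr[by + y][bx + x] - prev[by + dy + y][bx + dx + x]
             for x in range(size)] for y in range(size)]

# Toy 12x12 previous frame with a simple texture.
SIZE = 12
prev = [[(7 * y + 13 * x) % 50 for x in range(SIZE)] for y in range(SIZE)]
# Current frame: the same content moved down 1 pixel and right 2 pixels.
curr = [[prev[y - 1][x - 2] if y >= 1 and x >= 2 else 0
         for x in range(SIZE)] for y in range(SIZE)]

cost, dy, dx = best_motion_vector(curr, prev, by=4, bx=4, size=4, search=3)
residual = prediction_error(curr, prev, 4, 4, dy, dx, 4)
print("motion vector (dy, dx):", (dy, dx), "SAD:", cost)
```

The motion vector points from the block in the current frame to its match in the previous frame, so content that moved down and right yields a negative vector here.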
MPEG Library
The MPEG Library is a C library for decoding MPEG-1
video streams and dithering them to a variety of color
schemes.
Most of the code in the library comes directly from an
old version of the Berkeley MPEG player (mpeg_play)
The Library can be downloaded from
http://starship.python.net/~gward/mpeglib/mpeg_lib-1.3.1.tar.gz
MPEGe Library
The MPEGe(ncoding) Library is designed to allow you to
create MPEG movies from your application
The library can be downloaded from the files section of
http://groups.yahoo.com/group/mpegelib/
The encoder library uses the Berkeley MPEG encoder
engine, which handles all the complexities of MPEG
streams
As was the case with the decoder, this library can write
only one MPEG movie at a time
The library works well with most of the common image
formats
To keep things simple, we will stick to PPM
Note: All functions return non-NULL (i.e. TRUE) on success and zero (i.e.
FALSE) on failure.
Usage Details
You are not required to write code using the libraries to decode and encode
MPEG streams
Copy the binary executables from
http://www.csee.usf.edu/~mshreve/readframes
http://www.csee.usf.edu/~mshreve/encodeframes
Usage
To read frames from an MPEG movie (say test.mpg) and store them in the directory
extractframes (relative to your current working directory) with the filename
prefix testframe:
readframes test.mpg extractframes/testframe
This will decode all the frames of test.mpg into the directory extractframes with
the filenames testframe0.ppm, testframe1.ppm, and so on
To encode,
encodeframes 0 60 extractframes/testframe testresult.mpg
In order to convert between PPM and PGM formats, copy the script from
http://www.csee.usf.edu/~mshreve/batchconvert
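For reference, a PPM-to-PGM conversion can also be sketched directly in Python. This is not the batchconvert script. It assumes the frames are binary PPM (P6) with a maximum value of 255 and no header comments, and it uses the common 0.299/0.587/0.114 luminance weights, which may differ from what the script does. The name ppm_to_pgm is hypothetical.

```python
import re

def ppm_to_pgm(ppm_bytes):
    """Convert binary PPM (P6, maxval 255, no header comments) to binary PGM (P5)."""
    header = re.match(rb"P6\s+(\d+)\s+(\d+)\s+(\d+)\s", ppm_bytes)
    if header is None:
        raise ValueError("not a binary PPM (P6) without header comments")
    width, height, maxval = (int(g) for g in header.groups())
    if maxval != 255:
        raise ValueError("only maxval 255 is handled in this sketch")
    rgb = ppm_bytes[header.end():header.end() + 3 * width * height]
    if len(rgb) != 3 * width * height:
        raise ValueError("truncated pixel data")
    gray = bytearray()
    for i in range(0, len(rgb), 3):
        r, g, b = rgb[i], rgb[i + 1], rgb[i + 2]
        # Assumed luminance weights; the course script may use others.
        gray.append(round(0.299 * r + 0.587 * g + 0.114 * b))
    return b"P5\n%d %d\n255\n" % (width, height) + bytes(gray)

# In-memory demo: a 2x1 image with one red and one white pixel.
demo_ppm = b"P6\n2 1\n255\n" + bytes([255, 0, 0, 255, 255, 255])
demo_pgm = ppm_to_pgm(demo_ppm)
print(demo_pgm)
```

To convert a decoded frame, read a file such as extractframes/testframe0.ppm in binary mode, pass its contents to ppm_to_pgm, and write the result to a .pgm file.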