9/14/2006
Related specifications
MUSICAM
ASPEC
NICAM 728
Dolby AC-3
With a complex input spectrum, such as music, the threshold of
hearing is raised at nearly all frequencies.
As a result, the hiss of an analog audio cassette can be heard
only when the music is quiet.
Because of this slow response, masking can still occur even when
the two signals are not present at the same time.
Pre-masking and post-masking occur when the masking sound
continues to mask lower-level sounds before and after the
interval in which the masking sound itself is present.
Masking
[Figure: basic encoder block diagram. Input → Sub-band filter → Bit allocation → Bit-stream generation → Output, with a Compute masking threshold block alongside.]
MPEG Audio: Coding example
[Table: Level (dB) for bands 1 to 16. Legible entries: 0, 8, 12, 10, 10, 60, 35, 20, 15.]
MPEG Audio Layer I
Sync pattern and header.
MPEG Layer I decoder
Compression ratio
Layer I: 1:4
Layer II: 1:6...1:8
Layer III: 1:10...1:12
Layer III uses a Huffman coder.
[Figure: encoder block diagram. PCM input → Analysis filter bank (32 subbands, 0 to 31) → Scaler and Quantizer → Quantized sample encoder → Multiplexer → Output. Other blocks: Psychoacoustic model (SMRn), Bit-rate allocation (Rn), Bit-rate allocation encoder, Scale factor encoder (SFn).]
The input audio stream passes through a filter bank that divides
the input into multiple subbands of frequency.
The input audio stream simultaneously passes through a
psychoacoustic model that determines the ratio of the signal
energy to the masking threshold for each subband.
The bit- or noise allocation block uses the Signal-to-Mask
Ratios to decide how to apportion the total number of code
bits available for the quantization of the subband signals to
minimize the audibility of the quantization noise.
Finally, the multiplexer takes the representation of the quantized
subband samples and formats this data and side information into
a coded bitstream.
Ancillary data not necessarily related to the audio stream can
be inserted within the coded bitstream.
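The bit allocation step described above can be sketched as a greedy loop. This is only an illustration under stated assumptions, not the allocation procedure of the standard: the function name `allocate_bits`, the SMR values, and the bit budget are made up, and each extra quantizer bit is assumed to lower the quantization noise by about 6.02 dB. Scale factor bits and the per-layer quantizer tables are ignored.

```python
def allocate_bits(smr_db, total_bits, samples_per_band=12, max_bits=15):
    """Greedy bit allocation driven by signal-to-mask ratios (in dB).

    Assumption: each extra quantizer bit improves SNR by about 6.02 dB,
    so the noise-to-mask ratio of a band is NMR = SMR - 6.02 * bits.
    """
    bits = [0] * len(smr_db)
    remaining = total_bits
    while remaining >= samples_per_band:
        # Noise-to-mask ratio per band; the largest one is the most audible.
        nmr = [s - 6.02 * b if b < max_bits else float("-inf")
               for s, b in zip(smr_db, bits)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] <= 0:
            break  # quantization noise is below the mask in every band
        bits[worst] += 1  # one more bit for each sample of that band
        remaining -= samples_per_band
    return bits


smr = [20.0, 5.0, -3.0, 12.0]  # made-up SMR values for four subbands
print(allocate_bits(smr, total_bits=96))  # [4, 1, 0, 2]
```

The band with SMR below 0 dB receives no bits at all, because its signal already lies under the masking threshold.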
[Figure: grouping of subband samples. Each of the 32 subband filters delivers blocks of 12 samples; a Layer I frame is built from one 12-sample block per subband.]
Layer I: 12 * 32 = 384 samples.
Layer II, III: 12 * 3 * 32 = 1152 samples.
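The frame sizes above translate directly into frame durations. The short sketch below reproduces the arithmetic; the function names are hypothetical and the 48 kHz sampling rate is just an example value.

```python
SUBBANDS = 32
BLOCK = 12  # samples per subband in one block


def frame_samples(layer):
    """PCM samples per frame: Layer I uses one block, Layers II and III three."""
    groups = 1 if layer == 1 else 3
    return BLOCK * groups * SUBBANDS


def frame_duration_ms(layer, sample_rate_hz):
    """Time span of one frame in milliseconds."""
    return 1000.0 * frame_samples(layer) / sample_rate_hz


for layer in (1, 2, 3):
    print(layer, frame_samples(layer), frame_duration_ms(layer, 48000))
# 1 384 8.0
# 2 1152 24.0
# 3 1152 24.0
```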
Nguyen Chan Hung - Faculty of Electronics & Telecommunications - HUT
[Figure: psychoacoustic model. PCM input → Fast Fourier Transform (FFT) → Tonal/nontonal separator → Compute tonal masking threshold function and Compute nontonal masking threshold function → Masking threshold function → Calculate minimum (Mn). Side paths: Compute quiet threshold; Compute signal power (Sn). Output: SMRn.]
The separator identifies and separates the tonal and noise-like (non-tonal) components of the audio signal, because the
masking abilities of the two types of signal differ.
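A much simplified sketch of such a separator is shown below. The rule used here is an assumption for illustration only: a spectral line counts as tonal when it is a local maximum and stands at least 7 dB above the lines two bins away on either side. The function name `split_tonal` and the spectrum values are made up, and a real model would go on to merge neighbouring lines and sum the remaining energy per critical band.

```python
def split_tonal(spectrum_db, margin_db=7.0, reach=2):
    """Label each spectral line index as tonal or non-tonal.

    Simplified rule (an assumption for illustration): a line is tonal if it
    is a local maximum and exceeds the lines `reach` bins away on both
    sides by at least `margin_db`.
    """
    tonal, nontonal = [], []
    n = len(spectrum_db)
    for k in range(n):
        x = spectrum_db[k]
        is_peak = (
            reach <= k < n - reach
            and x > spectrum_db[k - 1]
            and x >= spectrum_db[k + 1]
            and x - spectrum_db[k - reach] >= margin_db
            and x - spectrum_db[k + reach] >= margin_db
        )
        (tonal if is_peak else nontonal).append(k)
    return tonal, nontonal


spectrum = [10, 12, 11, 30, 55, 31, 12, 14, 13, 12]  # made-up levels in dB
print(split_tonal(spectrum))  # ([4], [0, 1, 2, 3, 5, 6, 7, 8, 9])
```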
[Figure: encoder block diagram with Huffman coding. PCM input → Analysis filter bank → Scaler and Quantizer → Quantized sample Huffman encoder → Buffer → Multiplexer → Output. Other blocks: Psychoacoustic model (SMRn), Scale factor encoder, Side information encoder (Side information), Buffer fullness feedback.]
Frame formats of the three layers (field lengths in bits)
Layer I:   Header (32) | CRC (0,16) | Bit Allocation (128,256) | Scale factor (0-384) | Samples | Ancillary data
Layer II:  Header (32) | CRC (0,16) | Bit Allocation (128,256) | SCFSI (0-60) | Scale factor (0-384) | Samples | Ancillary data
Layer III: Header (32) | CRC (0,16) | Side information (136, 256) | Main Data | Ancillary data
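All three layers start with the same 32-bit header, and the CRC field is either absent or 16 bits long. The sketch below splits a header word into some of its fields, assuming the MPEG-1 layout (12-bit sync word, 1-bit ID, 2-bit layer code, 1-bit protection flag, 4-bit bitrate index, 2-bit sampling index, padding bit, and so on). The function name `parse_header` and the example word 0xFFFB9064 are illustrative, not taken from the slides.

```python
def parse_header(word):
    """Split a 32-bit MPEG-1 audio frame header into a few of its fields."""
    if (word >> 20) & 0xFFF != 0xFFF:
        raise ValueError("sync word not found")
    layer_code = (word >> 17) & 0x3
    protection_bit = (word >> 16) & 0x1
    return {
        "id": (word >> 19) & 0x1,
        # Layer code: '11' = Layer I, '10' = Layer II, '01' = Layer III.
        "layer": {0b11: 1, 0b10: 2, 0b01: 3}.get(layer_code),
        # A cleared protection bit means a 16-bit CRC follows the header.
        "crc_bits": 0 if protection_bit else 16,
        "bitrate_index": (word >> 12) & 0xF,
        "sampling_index": (word >> 10) & 0x3,
        "padding": (word >> 9) & 0x1,
        "mode": (word >> 6) & 0x3,
    }


fields = parse_header(0xFFFB9064)  # example header word, chosen for illustration
print(fields["layer"], fields["crc_bits"], fields["bitrate_index"])  # 3 0 9
```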
MP3 frame
The main data section contains the coded scale factor values
and the Huffman-coded frequency lines.
Its length depends on the bitrate and the length of the ancillary
data.
The length of the scale factor part depends on whether scale
factors are reused, and also on the window length (short or long).
The scale factors are used in the requantization of the
samples.
The demand for Huffman code bits varies with time during the
coding process.
A variable bitrate format can handle this, but a fixed
bitrate is often required for applications such as broadcasting.
Therefore there is also a bit reservoir technique that allows
unused main data storage in one frame to be used by up to
two consecutive frames.
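How the frame length follows from a fixed bitrate can be made concrete with the usual MPEG-1 frame-length formulas. This is a sketch with a hypothetical function name; the bitrates and sampling rates are example values.

```python
def frame_length_bytes(layer, bitrate_bps, sample_rate_hz, padding=0):
    """Length of one MPEG-1 audio frame in bytes at a fixed bitrate.

    Layer I frames hold 384 samples and are counted in 4-byte slots;
    Layer II and III frames hold 1152 samples and use 1-byte slots.
    """
    if layer == 1:
        return (12 * bitrate_bps // sample_rate_hz + padding) * 4
    return 144 * bitrate_bps // sample_rate_hz + padding


print(frame_length_bytes(3, 128000, 44100))  # 417
print(frame_length_bytes(3, 128000, 48000))  # 384
```

At 44.1 kHz the division is not exact, which is why the header carries a padding bit: some frames are one slot longer so that the average bitrate comes out right.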
The design of the Layer III bitstream better fits the encoder's
time-varying demand for code bits.
As with Layer II, Layer III processes the audio data in frames of
1,152 samples.
Unlike Layer II, the coded data representing these samples do not
necessarily fit into a fixed-length frame in the coded bitstream.
The encoder can donate bits to a reservoir when it needs fewer
than the average number of bits to code a frame.
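The donate-and-borrow behaviour of the reservoir can be illustrated with a toy simulation. Everything here is an assumption for illustration: the function name `run_reservoir`, the per-frame demands, the mean frame budget, and the reservoir cap are made-up numbers, not values from the standard.

```python
def run_reservoir(demands, mean_bits, max_reservoir):
    """Simulate a bit reservoir over a sequence of frames.

    Each frame may spend its own budget plus whatever earlier frames
    donated; bits it does not need are donated in turn, up to a cap.
    """
    reservoir = 0
    granted = []
    for need in demands:
        available = mean_bits + reservoir
        used = min(need, available)
        granted.append(used)
        reservoir = min(available - used, max_reservoir)
    return granted, reservoir


demands = [2500, 2000, 4200, 3000, 3400]  # made-up bit demands per frame
print(run_reservoir(demands, mean_bits=3000, max_reservoir=1500))
# ([2500, 2000, 4200, 3000, 3300], 0)
```

The third frame gets 4200 bits although the mean budget is 3000, because the first two frames left bits behind; the last frame finds the reservoir nearly empty and has to make do with 3300.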
Purpose
Increase the frequency resolution within the subbands for
better perceptual coding.
Allow the aliasing caused by the subband filters to be reduced.
MDCT (Modified Discrete Cosine Transform).
50% overlap between successive transforms.
Short MDCT window: 6 sub-subbands (12-point DCT)
in each subband. Better time resolution.
Long MDCT window: 18 sub-subbands (36-point DCT)
in each subband. Better frequency resolution.
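The relation between window length and number of sub-subbands follows from the MDCT itself: N input samples yield N/2 coefficients. Below is a minimal direct (O(N^2)) sketch. The sine window and the constant test signal are assumptions for illustration, and a practical codec would use a fast algorithm and its own window shapes.

```python
import math


def mdct(block):
    """Direct MDCT: N windowed input samples give N/2 coefficients."""
    n = len(block)
    half = n // 2
    return [
        sum(block[i] * math.cos(math.pi / half * (i + 0.5 + half / 2) * (k + 0.5))
            for i in range(n))
        for k in range(half)
    ]


def sine_window(n):
    """Sine window of length n (assumed here as an example window)."""
    return [math.sin(math.pi * (i + 0.5) / n) for i in range(n)]


long_block = sine_window(36)   # constant signal of 1.0 times the window
short_block = sine_window(12)
print(len(mdct(long_block)), len(mdct(short_block)))  # 18 6
print(32 * 18, 32 * 6)  # 576 192 frequency lines across the 32 subbands
```

Because consecutive blocks overlap by 50%, each block of N samples advances the signal by only N/2 samples, so the transform stays critically sampled.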
MP3 Decoder
MP3 Performance
Sound quality    Bandwidth  Mode    Bitrate        Reduction ratio
Telephone sound  2.5 kHz    mono    8 kbps *       96:1
Short wave       4.5 kHz    mono    16 kbps        48:1
AM radio         7.5 kHz    mono    32 kbps        24:1
FM radio         11 kHz     stereo  56...64 kbps   26...24:1
Near-CD          15 kHz     stereo  96 kbps        16:1
CD               >15 kHz    stereo  112...128 kbps 14...12:1
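The reduction ratios in the table can be reproduced by comparing each bitrate with uncompressed PCM. The sketch below assumes a reference of 48 kHz, 16 bits per sample (768 kbps per channel); that reference is an inference from the numbers in the table, and the function name is hypothetical.

```python
def reduction_ratio(bitrate_kbps, channels, sample_rate_hz=48000, bits_per_sample=16):
    """Ratio of the uncompressed PCM rate to the coded bitrate."""
    pcm_kbps = sample_rate_hz * bits_per_sample * channels / 1000
    return pcm_kbps / bitrate_kbps


rows = [("Telephone sound", 8, 1), ("AM radio", 32, 1),
        ("Near-CD", 96, 2), ("CD", 128, 2)]
for name, kbps, channels in rows:
    print(name, reduction_ratio(kbps, channels))
# Telephone sound 96.0
# AM radio 24.0
# Near-CD 16.0
# CD 12.0
```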
MPEG-2 Audio
The L and R channels are coded as in MPEG-1.
The additional channels are coded as ancillary data within
the MPEG-1 audio stream.
[Figure: MPEG-2 multichannel frame. MPEG-1 part: CRC, Bit Allocation, SCFSI, Scale factor, Samples. Ancillary data 1 carries the multichannel (MC) extension: MC Header, MC CRC, MC Bit Allocation, MC SCFSI, MC Predictor, MC Samples. Followed by Ancillary data 2 and Multi-lingual Commentary.]
[Figure: MPEG-2 Audio extensions. Low Frequency: Layer I, Layer II, Layer III. MultiChannel: 5 channels at 32, 44.1, 48 kHz; Layer I, Layer II, Layer III.]
Key Points
MPEG-2 Audio BC
MPEG-2 AAC (NBC)