
1.1 Conventional MFCC Feature Extraction Algorithm


The block diagram of the conventional MFCC feature extraction algorithm is
shown in Figure 1.1-1. The algorithm takes its name from the nonlinear
frequency warping it applies along the Mel frequency scale. First, the speech
is sampled by an A/D converter. Although the human ear can hear sounds from
20 Hz to 20 kHz, ordinary speech lies mostly below 5 kHz, and telephone-quality
audio is band-limited to 4 kHz. For this reason we use a 4 kHz bandwidth in
this project, with a sampling frequency of 8 kHz.
[Block diagram. Blocks: Speech samples, Pre-emphasis, Frame Blocking, Windowing, |FFT|, Mel Frequency Filter bank, Cepstrum, Logged energy, Delta, MFCC]

Figure 1.1-1 Block diagram of the conventional MFCC feature extraction algorithm.
1.1.1 Pre-emphasis
After digitization, the speech is pre-emphasized with a first-order finite
impulse response (FIR) filter, chosen because its phase is linear and it is
simple to implement. In speech, the lower-frequency components usually carry
more energy, so they weigh more heavily in modelling than the higher-frequency
components. A pre-emphasis filter is therefore used to amplify the signal at
higher frequencies. The transfer function of the filter is given by equation
(1), where the parameter a is typically between 0.9 and 1. In the time domain,
the relation between output and input is given by equation (2), where $s_i$ is
the i-th sample of the speech signal before filtering and $s'_i$ is the i-th
sample of the speech signal after pre-emphasis.

$$H(z) = 1 - a\,z^{-1} \tag{1}$$

$$s'_i = s_i - a\,s_{i-1} \tag{2}$$
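With a = 0.97, the value used in software speech recognition systems, the
filter has the frequency response shown in Figure 1.1-2. At ω = π (half the
sampling frequency) the magnitude is about 36 dB higher than at ω = 0, since
|H| is 1.97 at ω = π and 0.03 at ω = 0.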

Figure 1.1-2 Frequency response of the pre-emphasis filter
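The time-domain relation for pre-emphasis can be sketched in a few lines of Python. This is only an illustration: the function name `pre_emphasis` and the choice to treat the sample before the first one as zero are assumptions, not something the text specifies.

```python
def pre_emphasis(samples, a=0.97):
    """Apply s'_i = s_i - a * s_(i-1) to a sequence of samples."""
    out = []
    prev = 0.0  # assumption: the sample preceding the first one is zero
    for s in samples:
        out.append(s - a * prev)
        prev = s
    return out


# A constant (0 Hz) input is strongly attenuated after the first sample.
filtered = pre_emphasis([1.0, 1.0, 1.0])
```

With a constant input, every output after the first equals 1 - a, which matches the low gain of the filter at ω = 0.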
1.1.2 Frame Blocking
Because speech is a slowly time-varying signal, a speech recognition system
segments it into short intervals called frames. To keep the frame parameters
from changing abruptly, adjacent frames usually overlap by 50%, as shown in
Figure 1.1-3. In software speech recognition systems, speech is divided into
20 ms frames with 10 ms of overlap. At a sampling frequency of 8 kHz, each
frame contains 160 samples, and 80 samples are shared between two adjacent
frames.
[Diagram: consecutive frames f_n, f_(n+1), f_(n+2), ...; each frame is 160 samples long and adjacent frames overlap by 80 samples]

Figure 1.1-3 Frame blocking in speech analysis.
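The framing described above (160-sample frames advancing by 80 samples) might be sketched as follows. The function name `frame_blocking` and the decision to drop a trailing partial frame are assumptions made for this illustration.

```python
def frame_blocking(samples, frame_len=160, hop=80):
    """Split samples into frames of frame_len that start every hop samples."""
    frames = []
    start = 0
    # Assumption: an incomplete frame at the end of the signal is discarded.
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames


signal = list(range(400))        # 50 ms of dummy samples at 8 kHz
frames = frame_blocking(signal)  # 20 ms frames with 10 ms overlap
```

The second half of each frame is identical to the first half of the next one, which is the 50% overlap in Figure 1.1-3.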
1.1.3 Windowing
A window is usually applied to improve the continuity between adjacent
frames. One of the most widely used windows in speech recognition is the
Hamming window, defined by equation (3), where L is the window length, which
equals the frame length. Figure 1.1-4 shows the time response of the 160-point
Hamming window.

$$\mathrm{ham}(l) = 0.54 - 0.46 \cos\!\left[\frac{2\pi\,(l-1)}{L-1}\right], \quad l = 1, 2, \ldots, L \tag{3}$$


Figure 1.1-4 The 160-point Hamming window.

[Diagram: each 160-sample frame is multiplied (×) by the Hamming window]

Figure 1.1-5 Windowing in speech analysis.
Figure 1.1-5 illustrates how a Hamming window is applied to the speech signal
in speech analysis. After the speech has been divided into 160-sample frames
with 50% overlap, each frame is multiplied sample by sample with the 160-point
Hamming window. The windowed frames are then continuous at the first and last
points of each frame. This step is expressed by equation (4), where $f_n(l)$
is the n-th pre-emphasized frame, $\mathrm{ham}(l)$ is the Hamming window, and
$wf_n(l)$ is the n-th frame after the Hamming window.

$$wf_n(l) = f_n(l)\,\mathrm{ham}(l), \quad l = 1, 2, \ldots, L \tag{4}$$
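As a rough sketch of the windowing step, the code below builds an L-point Hamming window and multiplies it into a frame sample by sample. It assumes the standard Hamming definition with coefficients 0.54 and 0.46 and 1-based index l; the names `hamming` and `apply_window` are illustrative.

```python
import math


def hamming(length=160):
    """Hamming window ham(l) for l = 1..length (standard 0.54/0.46 form)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * (l - 1) / (length - 1))
            for l in range(1, length + 1)]


def apply_window(frame, window):
    """Multiply a frame by the window, sample by sample."""
    return [x * w for x, w in zip(frame, window)]


window = hamming(160)
windowed = apply_window([1.0] * 160, window)
```

The window is symmetric and falls to 0.08 at both ends, so the edges of each frame are strongly attenuated.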

1.1.4 FFT
The fast Fourier transform (FFT) is used to compute the spectrum of the
speech signal. It is an efficient implementation of the discrete Fourier
transform (DFT), with the constraint that the spectrum is evaluated at
discrete frequencies that are multiples of $f_s/N$ (mutually orthogonal
frequencies), where $f_s$ is the sampling frequency and N is the DFT length.
The FFT requires computation proportional to $N \log N$, whereas the direct
DFT requires computation proportional to $N^2$.
The frequency resolution of the DFT is limited by two factors: the length of
the signal and the length of the DFT [14]. If a signal is the sum of two
sinusoids whose frequencies are very close, a long segment of the signal must
be observed to tell the two frequencies apart. As for the DFT length, the
spectrum produced by an N-point DFT consists of N/2 equally spaced points
between 0 and half the sampling frequency. To separate two closely spaced
frequencies, the spacing between these points must therefore be smaller than
the distance between the two peaks. With windowed frames of 160 points, the
DFT length is set to 256 points, which gives good frequency resolution at a
computational cost that is acceptable in a practical implementation. After the
256-point FFT, only the magnitudes (which require a square root) of the first
128 points are used in the following steps, because of the symmetry of the
FFT of a real signal.
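To make the 256-point FFT step concrete, here is a minimal sketch using a textbook recursive radix-2 FFT in pure Python. It is not the implementation used by the authors; the names `fft` and `magnitude_spectrum` are illustrative, and zero-padding the 160-sample frame to 256 points is an assumption about how the lengths are reconciled.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def magnitude_spectrum(frame, nfft=256):
    """Zero-pad the frame to nfft points and return |X| for the first nfft/2 bins."""
    padded = list(frame) + [0.0] * (nfft - len(frame))
    return [abs(c) for c in fft(padded)[:nfft // 2]]


# A 1 kHz tone sampled at 8 kHz should peak at bin 1000 / (8000 / 256) = 32.
tone = [math.cos(2 * math.pi * 1000 * i / 8000) for i in range(160)]
mags = magnitude_spectrum(tone)
```

Only 128 magnitudes are kept because the upper half of the spectrum of a real signal mirrors the lower half.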
1.1.5 Mel Frequency Bank
A bank of digital filters is used to model the initial stages of transduction
in the human auditory system, for two reasons. First, the position of maximum
displacement along the basilar membrane of the ear in response to a stimulus
is proportional to the logarithm of the sound frequency. Second, the
frequencies of a complex sound that fall within a certain band around a
nominal frequency cannot be perceived individually.
Because the human auditory system does not respond linearly to sound
frequency, a Mel scale is used to map perceived sound frequency onto a linear
scale. This frequency scale is defined by equation (5) and illustrated in
Figure 1.1-6. It is approximately linear from 0 to 1000 Hz and approximately
logarithmic above 1000 Hz.

$$\mathrm{Mel}(f) = 2595 \log_{10}\!\left(1 + \frac{f}{700}\right) \tag{5}$$

Figure 1.1-6 The Mel frequency scale
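The mapping between hertz and Mel can be sketched as below. The constants 2595 and 700 are the commonly used form of the Mel formula and are assumed here; the function names are illustrative.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the Mel scale (common 2595/700 form, assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse mapping from Mel back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


mel_at_1k = hz_to_mel(1000.0)  # very close to 1000 Mel
mel_at_4k = hz_to_mel(4000.0)  # far less than 4000 Mel: the scale is compressed
```

With these constants, 1000 Hz maps to almost exactly 1000 Mel, while the upper part of the 4 kHz band is compressed, which is the near-linear then near-logarithmic behaviour described above.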
A conventional Mel-scale filter bank for speech recognition consists of a
number of triangular bandpass filters distributed over the signal bandwidth.
The filters are equally spaced on the Mel scale, and their bandwidths are
designed so that the 3 dB point lies midway between two adjacent filters.
Figures 1.1-7(a) and 1.1-7(b) show these filters on the Mel scale and on the
ordinary frequency scale, respectively.

Figure 1.1-7 A Mel filter bank, on the Mel scale (a) and on the ordinary frequency scale (b)
The number of filters is one of the parameters that affect the recognition
accuracy of the system.
The k-th power coefficient of the n-th frame is computed by equation (6),
where $S_{nj}$ is the j-th spectral point of the n-th frame and $FC_{kj}$ is
the j-th coefficient of the k-th filter.

$$S'_{nk} = \sum_{j} S_{nj}\,FC_{kj}, \quad k = 1, 2, \ldots, K \tag{6}$$

Here K is the number of filters.
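The sketch below shows one possible way to build triangular filters that are equally spaced on the Mel scale and to form the power coefficients as weighted sums of spectral points. Everything about the filter design here is an assumption made for illustration (the Mel constants, the edge placement, and the default of 27 filters, a figure taken from the multiplication count later in the document); the names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_filter_bank(num_filters=27, nfft=256, fs=8000):
    """Return num_filters rows of nfft/2 weights FC[k][j], triangles on the Mel scale."""
    mel_max = hz_to_mel(fs / 2)
    # num_filters + 2 edge frequencies, equally spaced in Mel.
    edges = [mel_to_hz(mel_max * i / (num_filters + 1))
             for i in range(num_filters + 2)]
    bank = []
    for k in range(1, num_filters + 1):
        lo, mid, hi = edges[k - 1], edges[k], edges[k + 1]
        row = []
        for j in range(nfft // 2):
            f = j * fs / nfft  # centre frequency of spectral point j
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def power_coefficients(spectrum, bank):
    """For each filter k, sum spectrum[j] * FC[k][j] over the spectral points j."""
    return [sum(s * c for s, c in zip(spectrum, row)) for row in bank]


bank = triangular_filter_bank()
flat = power_coefficients([1.0] * 128, bank)
```

Each triangle reaches half amplitude midway to its neighbour's centre, so adjacent filters cross between their centre frequencies.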
1.1.6 Cepstral Analysis
The speech signal s can be described as the convolution of an excitation
signal with the impulse response of the vocal tract. It can be separated into
these two parts through the equations below, where g is the excitation signal
and v is the impulse response of the vocal tract [15].

$$s(t) = g(t) * v(t) \tag{7}$$

$$S(f) = G(f)\,V(f) \tag{8}$$

$$\log |S(f)| = \log |G(f)| + \log |V(f)| \tag{9}$$

Equation (7) gives the relation between g and v in the time domain, and
equation (8) gives it in the frequency domain. Taking the logarithm of both
sides yields equation (9), in which the excitation signal and the vocal tract
response are separated into a sum. The vocal tract response determines the
envelope of the spectrum, while the spectrum of the excitation signal
represents the fine spectral detail of the speech. For speech recognition the
spectral envelope is more useful than the fine detail, so an inverse Fourier
transform can be used to find the spectral envelope.
The cepstrum is defined as the inverse Fourier transform of the power
coefficients after the logarithm has been taken. It can be simplified to a
discrete cosine transform (DCT):

$$C_{np} = \sum_{k=1}^{K} \log\!\left(S'_{nk}\right) \cos\!\left[\frac{p\,(k - 0.5)\,\pi}{K}\right], \quad p = 1, 2, \ldots, P \tag{10}$$

where p is the order of the cepstral coefficient. For each frame, $C_0$ is
normally not used in the analysis because it is unreliable. The low-order
cepstral coefficients reflect the vocal tract information of the speech
signal. In spectral analysis for speech recognition, usually only the 8 to 16
lowest-order cepstral coefficients are used; most applications use 12.
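A direct, unoptimized sketch of the DCT step is given below. It assumes the unnormalized cosine sum over k = 1..K with p = 1..P, and that all power coefficients are strictly positive so the logarithm is defined; the function name `cepstrum` is illustrative.

```python
import math


def cepstrum(power_coeffs, num_ceps=12):
    """Return C_1..C_P from K positive filter-bank power coefficients."""
    K = len(power_coeffs)
    logs = [math.log(s) for s in power_coeffs]  # requires every s > 0
    return [
        sum(logs[k - 1] * math.cos(p * (k - 0.5) * math.pi / K)
            for k in range(1, K + 1))
        for p in range(1, num_ceps + 1)
    ]


# A perfectly flat filter-bank output has no spectral shape,
# so all coefficients C_1..C_12 come out (numerically) zero.
flat_ceps = cepstrum([2.0] * 27)
```

The double loop makes the cost visible: P times K multiplications per frame, which is the 12 × K term counted later in the document.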
1.1.7 Energy Calculation
The power of each frame is also a component of the MFCC feature. It is
computed as the logarithm of the signal power; that is, for the n-th frame,
which has 160 samples $\{ s_{nl},\ l = 1, 2, \ldots, 160 \}$,

$$\log E_n = \log\!\left(\sum_{l=1}^{160} s_{nl}^{2}\right) \tag{11}$$

In software speech recognition systems, this energy is computed
independently, before pre-emphasis and windowing.
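The log-energy term is simple enough to sketch directly. The natural logarithm and the small `floor` that guards against an all-zero (silent) frame are assumptions of this example, not something the text specifies.

```python
import math


def log_energy(frame, floor=1e-10):
    """Log of the sum of squared samples; floor is a hypothetical guard against log(0)."""
    energy = sum(s * s for s in frame)
    return math.log(max(energy, floor))


loud = log_energy([1.0] * 160)    # log(160)
silent = log_energy([0.0] * 160)  # clamped to log(floor)
```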
1.1.8 Delta Coefficient
The performance of a speech recognition system can be improved considerably
by adding time derivatives to the basic static parameters. In digital signal
processing, the first-order time derivative can be approximated by

$$\frac{d f(n)}{dn} \approx f(n) - f(n-1) \tag{12}$$

$$\frac{d f(n)}{dn} \approx f(n+1) - f(n) \tag{13}$$

Equation (12) is called the backward difference and equation (13) the forward
difference. The delta coefficients can therefore be computed with the
regression formula below, where $d_n$ is the delta coefficient vector of the
n-th frame. Computing $d_n$ uses the static coefficient vectors from
$C_{n-2}$ to $C_{n+2}$, where $C_n$ is the vector made up of the log energy
and the 12 cepstral coefficients of the n-th frame.

$$d_n = \frac{\sum_{\theta=1}^{2} \theta\,\left(C_{n+\theta} - C_{n-\theta}\right)}{2 \sum_{\theta=1}^{2} \theta^{2}} \tag{14}$$
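The regression over a window of two frames on each side can be sketched as follows. The expanded form with denominator 2(1 + 4) = 10 is used, and the function name `delta` is illustrative; frames without two neighbours on both sides get no delta vector.

```python
def delta(features):
    """features[n] is the static vector C_n; returns d_n for n = 2 .. len-3."""
    out = []
    for n in range(2, len(features) - 2):
        d = [
            (2 * (c_p2 - c_m2) + (c_p1 - c_m1)) / 10.0
            for c_m2, c_m1, c_p1, c_p2 in zip(
                features[n - 2], features[n - 1],
                features[n + 1], features[n + 2])
        ]
        out.append(d)
    return out


# Two coefficients that grow linearly by 1 and 2 per frame.
ramp = [[float(n), 2.0 * n] for n in range(7)]
deltas = delta(ramp)
```

For a linear ramp the regression returns the slope exactly, and 7 input frames give 7 - 4 = 3 delta vectors.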

1.1.9 Conclusion
After the process described above, each 160-sample frame is converted into a
vector of 26 elements: 1 energy coefficient, 12 cepstral coefficients, and
their first-order time derivatives. n frames yield n - 4 feature vectors,
because each delta coefficient needs static information from frame n - 2
through frame n + 2. These feature vectors are used in training and
recognition.















1.2 Modified MFCC Feature Extraction Algorithm for Hardware Implementation
The MFCC feature extraction algorithm requires a large number of operations,
and most of the computing power is consumed by the Fourier transform. In this
chapter we introduce a modified MFCC feature extraction algorithm. With the
proposed method, the amount of computation is reduced by half. The block
diagram of the new algorithm is shown in Figure 1.2-1. The main differences
between the conventional and the improved algorithm are highlighted by the
dashed blocks.
[Block diagram. Blocks: Speech samples, Pre-emphasis, Sub-Frame, Windowing, |FFT|, Mel Frequency Filter bank, Overlap, Cepstrum, Logged energy, Delta, MFCC]

Figure 1.2-1 Block diagram of the modified MFCC feature extraction algorithm.
1.2.1 Pre-emphasis
The pre-emphasis filter is the same kind of filter as the one used in the
conventional algorithm described above. In software speech recognition the
coefficient a is set to 0.97, but for convenience of hardware implementation
we use a = 31/32. Figure 1.2-2 shows that the frequency responses of the two
filters differ only slightly, which is confirmed by the experimental results
presented later.
The advantage of the filter coefficient a = 31/32 is explained by equation
(15). In a binary number system, $\frac{1}{32} s_{i-1}$ is computed by
shifting $s_{i-1}$ right by 5 bits. Using this property, the multiplication
reduces to a shift operation, so both the computation time and the chip area
are reduced.

$$\begin{aligned}
s'_i &= s_i - a\,s_{i-1}, \quad a = \tfrac{31}{32} \\
     &= s_i - \tfrac{31}{32}\,s_{i-1} \\
     &= s_i - \left(s_{i-1} - \tfrac{1}{32}\,s_{i-1}\right)
\end{aligned} \tag{15}$$

Figure 1.2-2 Frequency response of the pre-emphasis filters
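The shift-based form of the filter can be mimicked on integers as in the sketch below. It assumes integer samples, a zero sample before the first one, and Python's arithmetic right shift (which rounds toward negative infinity); real hardware may round differently. The name `pre_emphasis_shift` is illustrative.

```python
def pre_emphasis_shift(samples):
    """Pre-emphasis with a = 31/32 using only subtraction and a 5-bit right shift."""
    out = []
    prev = 0  # assumption: the sample before the first one is zero
    for s in samples:
        # (31/32) * prev is formed as prev - (prev >> 5): no multiplier needed.
        out.append(s - (prev - (prev >> 5)))
        prev = s
    return out


shifted = pre_emphasis_shift([32, 64, 0])
```

For inputs that are multiples of 32 the result is exact; otherwise the shift truncates the 1/32 term, which is the small error traded for removing the multiplier.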
1.2.2 Sub-Frame Blocking
This step introduces a new term, the sub-frame. A conventional frame is made
up of several sub-frames with no overlap between adjacent sub-frames; the
overlap between conventional frames can then be seen as reuse of the same
sub-frame. As discussed in Section 1.1.2, the pre-emphasized speech is
normally divided into 20 ms frames with 50% overlap. We propose instead to
divide the speech signal into 10 ms sub-frames, so that two consecutive
sub-frames form one conventional frame. At a sampling frequency of 8 kHz, each
sub-frame is now only 80 points long (8000 × 0.01 = 80).
Figure 1.2-3 shows that 3 sub-frames form 2 frames, so n sub-frames make up
n - 1 frames.

[Diagram: (a) overlapping frames f_n, f_(n+1), f_(n+2), ..., each 160 samples long; (b) non-overlapping sub-frames sf_n, sf_(n+1), sf_(n+2), sf_(n+3), ..., each 80 samples long]

Figure 1.2-3 Sub-frames and frames
1.2.3 Windowing
Since the new algorithm divides the speech signal into 80-point sub-frames,
an 80-point Hamming window is applied to each sub-frame to reduce the edge
effects of each segment. As mentioned in Section 1.1.3, the main lobe and the
side lobes of the window function affect the spectral analysis of speech
signals. When the window length is halved, the width of the main lobe doubles
accordingly. Figure 1.2-4 shows the frequency responses of the 160-point and
80-point Hamming windows. Figure 1.2-5 illustrates the windowing scheme of the
proposed algorithm.

Figure 1.2-4 Frequency responses of the different Hamming windows.

Figure 1.2-5 Windowing scheme of the modified algorithm.
1.2.4 FFT
This step computes the spectrum of each sub-frame. Figure 1.2-6 shows a
windowed 80-point sub-frame and its spectra obtained with a 256-point and a
128-point FFT, respectively. The frequency resolution of the 256-point FFT
spectrum is only slightly better than that of the 128-point FFT spectrum, yet
it requires almost twice the computation, because the cost is proportional to
$N \log N$, where N is the number of FFT points.

$$\frac{N \log N}{\frac{N}{2} \log \frac{N}{2}} \approx 2 \tag{16}$$

Figure 1.2-6 Spectra of the signal from different FFTs
For these reasons we use a 128-point FFT in the new algorithm. Moreover,
because of the symmetry of the spectrum, only the first 64 points are used in
the following steps.
After the FFT, the magnitudes of the first 64 complex points are computed
with an estimation algorithm, which obtains the magnitude of a complex number
very quickly and with nearly the same accuracy as taking the square root. For
a complex number I + jQ, the magnitude estimate is:

$$M \approx \alpha \max\{|I|, |Q|\} + \beta \min\{|I|, |Q|\} \tag{17}$$

Taking absolute values confines the complex number to the range 0° to 90°,
and the max and min operations then confine it to the range 0° to 45°. Within
this range, a linear combination of I and Q gives a good approximation of the
magnitude. In this system we use α = 1 and β = 1/4. This approximation reduces
the amount of computation with an acceptable error.
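The magnitude estimate with α = 1 and β = 1/4 can be sketched as below; the function name `estimate_magnitude` is illustrative. In fixed-point hardware the factor 1/4 would presumably be a 2-bit right shift, but the sketch uses floating point so the error is easy to inspect.

```python
import math


def estimate_magnitude(i, q, alpha=1.0, beta=0.25):
    """Approximate sqrt(i*i + q*q) as alpha*max(|i|,|q|) + beta*min(|i|,|q|)."""
    a, b = abs(i), abs(q)
    return alpha * max(a, b) + beta * min(a, b)


approx = estimate_magnitude(3, 4)  # 4 + 0.25 * 3 = 4.75
exact = math.hypot(3, 4)           # 5.0
```

With these coefficients the estimate is exact on the axes, slightly high near 14° (where tan θ = 1/4), and about 11.6% low at 45°, where it equals 1.25/√2 times the true magnitude.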
1.2.5 Mel Frequency Filter Bank
We use the same method as discussed in Section 1.1.5, but the new algorithm
uses rectangular filters instead of triangular ones. According to earlier
studies, a filter bank is suitable for speech recognition if the combined
frequency response of its component filters is flat over the whole frequency
range of interest. A rectangular filter bank meets this requirement better
than the conventional triangular filter bank. Figures 1.2-7 and 1.2-8
illustrate the frequency responses of the triangular and rectangular filter
banks, respectively.

Figure 1.2-7 Conventional triangular filter bank (a) and the combined frequency response of its filters (b).

Figure 1.2-8 Proposed rectangular filter bank (a) and the combined frequency response of its filters (b).
(All rectangular filters have amplitude 1; they are drawn with different
amplitudes in the figure above only to make them easier to tell apart.)
In the conventional method, the FFT outputs are multiplied by the
coefficients of the triangular filters to produce the filter output values,
and all output values of one filter are then summed to give the power
coefficient of that filter. With rectangular filters in place of triangular
ones, however, the multiply-and-add reduces to add or do not add, because each
coefficient of a rectangular filter is either 1 or 0. The new algorithm
therefore requires no multiplication in this step.
1.2.6 Overlapping
The overlapping process in the new algorithm is illustrated in Figure 1.2-9.
Here $f_n$ and $f_{n+1}$ denote the n-th and (n+1)-th conventional frames
(each 160 points long) with 50% overlap, and $sf_n$ and $sf_{n+1}$ denote the
n-th and (n+1)-th sub-frames of the new algorithm (each 80 points long).
$FS'_{nk}$ is the power coefficient produced by the k-th filter for the n-th
sub-frame. The filter-bank outputs $FS'_{nk}$ and $FS'_{(n+1)k}$ are added to
give the power coefficient $S'_{nk}$, and the power coefficient
$S'_{(n+1)k}$ is obtained by summing the filter outputs $FS'_{(n+1)k}$ and
$FS'_{(n+2)k}$. In other words,

$$S'_{nk} = FS'_{nk} + FS'_{(n+1)k}, \qquad S'_{(n+1)k} = FS'_{(n+1)k} + FS'_{(n+2)k}$$

By adding the current filter-bank output to the previous one, we reproduce
the 50% overlap of the conventional algorithm.

Figure 1.2-9 The overlapping process in the proposed algorithm
A conventional frame $f_n$ consists of the two sub-frames $sf_n$ and
$sf_{n+1}$, and its energy equals the sum of the energies of these two
sub-frames, whether over the whole bandwidth (0 to 4 kHz) or within a
particular band (the k-th filter of the filter bank). In Figure 1.2-9,
$S'_{nk}$ therefore denotes the k-th power coefficient of the conventional
frame $f_n$, and $S'_{(n+1)k}$ denotes the k-th power coefficient of the
conventional frame $f_{n+1}$.
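The two ideas of this part of the algorithm, rectangular filters that only add and overlap by summing the outputs of consecutive sub-frames, can be sketched together. The band edges passed as `(start, stop)` bin ranges are hypothetical placeholders, since the text does not list the actual filter edges, and the function names are illustrative.

```python
def rectangular_filter_outputs(magnitudes, bands):
    """Each band is a (start, stop) bin range with weight 1 inside and 0 outside,
    so a filter output is a plain sum: no multiplication is needed."""
    return [sum(magnitudes[start:stop]) for start, stop in bands]


def overlap_add(subframe_outputs):
    """Frame coefficient k of frame n = output k of sub-frame n + output k of sub-frame n+1."""
    return [
        [cur + nxt for cur, nxt in zip(current, following)]
        for current, following in zip(subframe_outputs, subframe_outputs[1:])
    ]


# Three sub-frames with two (hypothetical) filters each give two frames.
sub_outputs = [[1, 2], [3, 4], [5, 6]]
frame_coeffs = overlap_add(sub_outputs)
```

Each sub-frame is filtered once and its output is reused in two frames, which is where the saving over filtering every overlapping 160-point frame comes from.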
1.2.7 Cepstral Analysis
The cepstral coefficients are computed in the same way as in the conventional
algorithm:

$$C_{np} = \sum_{k=1}^{K} \log\!\left(S'_{nk}\right) \cos\!\left[\frac{p\,(k - 0.5)\,\pi}{K}\right], \quad p = 1, 2, \ldots, P \tag{18}$$
The logarithm is expensive to compute in both software and hardware. Because
the value of $S'_{nk}$ cannot be known in advance and the logarithm must be
computed quickly, if only approximately, we approximate it with Mitchell's
algorithm [19]. Let N be the number given by:

$$N = \sum_{i=0}^{k} 2^{i} z_i \tag{19}$$

where each $z_i$ is 0 or 1. Suppose $z_k = 1$ is the most significant nonzero
bit (MSB) of N. Then N can be rewritten as:

$$N = 2^{k} + \sum_{i=0}^{k-1} 2^{i} z_i \tag{20}$$

Factoring out $2^{k}$ gives:

$$N = 2^{k} \left( 1 + \sum_{i=0}^{k-1} 2^{\,i-k} z_i \right) \tag{21}$$

Taking $\log_2$ of both sides gives:

$$\log_2 N = k + \log_2\!\left( 1 + \sum_{i=0}^{k-1} 2^{\,i-k} z_i \right) \tag{22}$$

Because $i < k$, the sum $\sum_{i=0}^{k-1} 2^{\,i-k} z_i$ lies between 0 and
1; we denote its value by m. Equation (22) then becomes:

$$\log_2 N = k + \log_2(1 + m), \quad 0 \le m < 1 \tag{23}$$

Mitchell approximates the true value of $\log_2(1 + m)$ by the straight line
$am + b$. For simplicity, Mitchell uses $a = 1$ and $b = 0$ in this linear
approximation:

$$\log_2 N \approx k + m \tag{24}$$
As an example, we approximate $\log_2 39$, that is, N = 39. In binary,
N = 100111. In this case $k = 5$ and $m = (0.00111)_2$. We rewrite N as
$N = 2^{5}\,(1 + (0.00111)_2) = 39$ and use equation (24) to obtain the
approximation $\log_2 39 \approx 5 + (0.00111)_2 = (101.00111)_2$. This
approximate value is 5.21875, while the true value is $\log_2 39 = 5.2854$.
The error of the approximation is:

$$\mathrm{error} = \log_2(1 + m) - m \tag{25}$$

The maximum error is 0.086071, at m = 0.442695, which is found by setting the
first derivative with respect to m to zero:

$$\mathrm{error}' = \left[ \log_2(1 + m) - m \right]' = \frac{1}{(1 + m)\ln 2} - 1 = 0 \;\Rightarrow\; m = \frac{1}{\ln 2} - 1 \tag{26}$$
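Mitchell's approximation is easy to try on positive integers. The sketch below finds k as the position of the most significant set bit and reads m from the remaining bits; the function name `mitchell_log2` is illustrative, and a hardware version would keep m as a fixed-point fraction rather than a float.

```python
import math


def mitchell_log2(n):
    """Approximate log2(n) for a positive integer n as k + m."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    k = n.bit_length() - 1     # position of the most significant 1 bit
    mantissa = n - (1 << k)    # the bits below the MSB
    m = mantissa / (1 << k)    # fraction in [0, 1)
    return k + m


approx39 = mitchell_log2(39)   # k = 5, m = 7/32
true39 = math.log2(39)
```

The approximation never overshoots, and its error stays below the 0.086071 bound derived above; it is exact for powers of two.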

1.2.8 Energy Calculation
The energy of each frame is computed as the logarithm of the total energy of
two consecutive sub-frames (each 80 points long). It is given by equation
(27), where $sf_{nl}$ is the l-th sample of the n-th sub-frame.

$$\log E_n = \log\!\left( \sum_{l=1}^{80} sf_{nl}^{2} + \sum_{l=1}^{80} sf_{(n+1)l}^{2} \right) \tag{27}$$

As in the conventional algorithm, the energy in the new algorithm is computed
independently, before pre-emphasis and windowing. The logarithm is
approximated as discussed in Section 1.2.7.
1.2.9 Delta Coefficient
When computing the delta coefficients, we multiply the expression in equation
(14) by 10. All delta coefficients are thus scaled by 10, which does not
affect recognition performance because the shape of their distribution is
unchanged. Multiplying the delta coefficients by 10 makes the hardware
implementation much simpler, because the computation no longer contains a
division.

$$\begin{aligned}
d_n &= \frac{2\,(c_{n+2} - c_{n-2}) + (c_{n+1} - c_{n-1})}{10} \\
10\,d_n &= 2\,(c_{n+2} - c_{n-2}) + (c_{n+1} - c_{n-1})
\end{aligned} \tag{28}$$
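On integer coefficients the scaled delta needs only subtractions, one addition, and a 1-bit left shift. The sketch below illustrates this for a single coefficient; the function name `delta_times_10` and the use of a shift for the factor 2 are assumptions about how the hardware might realize it.

```python
def delta_times_10(c_m2, c_m1, c_p1, c_p2):
    """Return 10 * d_n from c_(n-2), c_(n-1), c_(n+1), c_(n+2) without dividing."""
    return ((c_p2 - c_m2) << 1) + (c_p1 - c_m1)


rising = delta_times_10(0, 1, 3, 4)   # coefficient rising by 1 per frame
falling = delta_times_10(4, 3, 1, 0)  # coefficient falling by 1 per frame
```

A slope of one unit per frame yields 10, confirming that the result is the true delta scaled by the constant 10.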
1.2.10 Advantages of the modified algorithm
Hardware multipliers consume more area and power than the hardware for other
operations. To save hardware resources, we therefore need to reduce the number
of multiplications when optimizing the algorithm. In the conventional MFCC
feature extraction algorithm of Section 1.1, a 160-point frame requires 160
multiplications for windowing, $128 \log_2(256) = 1024$ multiplications for
the FFT, about 256 multiplications for the power computation, and
12 × 27 = 324 multiplications for the DCT. Feature extraction for one frame
thus needs 1764 multiplications in total.
With the new feature extraction algorithm proposed in Section 1.2, feature
extraction for one frame needs only 804 multiplications: 80 for windowing,
$64 \log_2(128) = 448$ for the FFT, and 12 × 23 = 276 for the DCT.
The features extracted by the new method carry much the same information as
those of the conventional method, because they are likewise computed from the
frame power coefficients at the output of the Mel-scale filter bank. The
number of multiplications, however, is roughly halved (804 versus 1764), with
most of the saving in the FFT step. Moreover, a multiplication in the FFT is
four times as complex as a multiplication in the other blocks, because the
FFT requires complex multiplications. As a result, a hardware implementation
of the new algorithm uses less area and consumes less power, while its
recognition accuracy is only slightly lower than that of the conventional
algorithm.
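The multiplication counts quoted in this section can be reproduced with a few lines of arithmetic. The function and parameter names are illustrative; the filter counts 27 and 23 are read off the DCT terms 12 × 27 and 12 × 23.

```python
import math


def multiplications(window_len, nfft, magnitude_mults, num_ceps, num_filters):
    """Multiplications per frame: windowing + FFT + magnitude/power + DCT."""
    fft_mults = (nfft // 2) * int(math.log2(nfft))
    dct_mults = num_ceps * num_filters
    return window_len + fft_mults + magnitude_mults + dct_mults


conventional = multiplications(160, 256, 256, 12, 27)  # 160 + 1024 + 256 + 324
modified = multiplications(80, 128, 0, 12, 23)         # 80 + 448 + 0 + 276
ratio = modified / conventional
```

The ratio comes out at about 0.46, and the FFT term alone accounts for 576 of the 960 multiplications saved.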
1.2.11 Conclusion
With the new MFCC feature extraction algorithm, a frame made up of 2
sub-frames is converted into a 26-element MFCC vector consisting of 1 energy
coefficient, 12 cepstral coefficients, and 13 delta coefficients (the
first-order time derivatives of the energy and cepstral coefficients). With n
sub-frames there are n - 1 conventional frames, and the number of feature
vectors is n - 5.
