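The predicted value of the next sample is obtained by extrapolating from the p preceding sample values:
$$x'(n) = \sum_{i=1}^{p} a_i\, x'_{n-i} \qquad (2.1)$$
where the $a_i$ are the predictor coefficients. The difference between the input sample and the predicted sample is:
$$e_n = x_n - x'(n) \qquad (2.2)$$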
This difference is the value that is quantized and transmitted. The receiver recovers the error signal and integrates it with the previously reconstructed signal. To limit the accumulation of errors over many samples, the receiver uses a predictor identical to the one in the transmitter. The feedback loop lets the quantizer limit the difference between the error $e_n$ and the quantized error $e'_n$, that is $(e_n - e'_n)$. The smaller this difference, the better the speech quality. According to calculations, this method halves the required bandwidth.
2.6. Delta modulation (DM)
Delta modulation is a form of DPCM in which each codeword consists of a single binary bit. Its advantage is that the circuit is easy to build (see the figure below). The speech signal, band-limited to 0.3-3.4 kHz, is sampled to form the PAM signal $x_n$. This signal is compared with the predicted signal $x'_n$, and the difference between the two values ($e_n$) is quantized to one of two values, $-\Delta$ or $+\Delta$. The quantizer output transmits one binary bit for each sample. At the receiver, the values $\pm\Delta$ are added to the current predicted value so that the decoder output reconstructs the original speech.
Chapter 2: Digital Representation of Speech Signals
The bit rate of delta modulation equals the sampling frequency, that is 8 kbps. As noted, the method is quite simple and achieves a very low coding rate. It is the only waveform coding method whose bit rate is comparable to that of source-parameter (parametric) coding. However, the quality of the coded signal is not high, and the method does not provide the dynamic range of a PCM system.
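The encoder and decoder described above can be sketched in a few lines. This is a minimal illustration rather than the circuit of the figure: the step size `delta = 0.1`, the 100 Hz test tone, and the function names `dm_encode` and `dm_decode` are assumptions chosen for the example.

```python
import math

def dm_encode(samples, delta):
    """Linear delta modulation: emit one bit per sample."""
    bits = []
    predicted = 0.0  # x'_n, the staircase approximation held by the predictor
    for x in samples:
        bit = 1 if x >= predicted else 0      # sign of e_n = x_n - x'_n
        predicted += delta if bit else -delta  # step the staircase by +/- delta
        bits.append(bit)
    return bits

def dm_decode(bits, delta):
    """Rebuild the staircase by accumulating +delta / -delta steps."""
    out = []
    value = 0.0
    for bit in bits:
        value += delta if bit else -delta
        out.append(value)
    return out

# Hypothetical test signal: a 100 Hz sine sampled at 8 kHz (one bit per sample -> 8 kbps).
fs = 8000
signal = [math.sin(2 * math.pi * 100 * n / fs) for n in range(160)]
bits = dm_encode(signal, delta=0.1)
decoded = dm_decode(bits, delta=0.1)
max_error = max(abs(a - b) for a, b in zip(signal, decoded))
```

The step size here is larger than the steepest per-sample change of the tone, so the staircase never falls behind (no slope overload) and the reconstruction error stays within one step. A smaller step would reduce granular noise but risk slope overload, which is the trade-off that motivates the adaptive scheme in the next section.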
2.7. Adaptive delta modulation (ADM)
This method is also known as continuously variable slope delta modulation. It overcomes the limited dynamic range of delta modulation by dynamically changing the gain of the integrator to match the average power level of the input signal.
Figure 2.11 Delta encoder and decoder
Figure 2.12 Signal waveforms in DM
The quantization step size is varied by changing the gain of the integrator, using an RC circuit and a squaring circuit. When the input signal is constant or varies slowly with time, the modulator hunts and produces a sequence of pulses of alternating polarity. The RC circuit averages this sequence and outputs a value close to zero, so the control signal changes the amplifier gain very little and the amplifier output has a small step size. When the input signal has a steep slope, a staircase function is generated to keep up with the slope of the input. This produces a run of pulses of the same polarity; the RC circuit averages the run and outputs a large control voltage, so the step size increases. Because of the squaring circuit, the control voltage applied to the amplifier is always positive, regardless of the polarity of the pulses. This method can reduce both slope-overload distortion and granular noise.
Figure 2.13 Signal waveforms in ADM
Figure 2.14 ADM encoder and decoder
2.8. Adaptive differential pulse code modulation (ADPCM)
This is an important coding method that combines the advantages of the methods above. It was standardized by ITU-T in Recommendation G.721 and has many practical applications, for example in the CT2 cordless system (first deployed in the United Kingdom) and in DECT (standardized in Europe by ETSI). We therefore study it in more depth. The standardized rates are 40, 32, 24 and 16 kbps. The method relies on the slowly varying nature of the variance and the autocorrelation function of speech. PCM uses a uniform quantizer whose noise power is $\Delta^2/12$. In ADPCM, and in linear prediction methods in general, $\Delta$ varies, which is why they are also described as using a self-adaptive quantizer. The algorithms developed for differential pulse code modulation encode the speech signal with an adaptive quantizer and an adaptive predictor whose parameters are updated periodically to reflect the statistics of the speech signal.
Figure 2.15 ADPCM encoder
Figure 2.16 ADPCM decoder
2.9. Practical exercise: digital representation of speech signals
Using a personal computer and Matlab (or another programming language), carry out the following tasks:
Record an arbitrary segment of speech. Save the file in raw format (*.wav).
Use Matlab or another programming language to read the signal and display its time-domain waveform.
Display the spectrum of a segment of the signal using different window functions.
Apply one of the coding methods studied in this chapter to the signal segment. Evaluate the result against criteria such as file size, perceived audio quality, etc.
Chapter 3: Speech Analysis
3.1. Introduction
This chapter examines methods for analyzing speech signals. Speech analysis addresses the problem of finding an optimal form that represents speech efficiently. It is the basis for developing techniques for speech synthesis, recognition and enhancement. Speech analysis usually extracts or transforms the speech signal into another representation that conveys the speech information better for the purpose at hand. In general, most speech analysis methods focus on one of three main problems. The first is to remove the influence of phase, a component that does not play an important role in conveying speech information. The second is to separate the sound source from the filter (the vocal tract model) so that the spectral envelope of the signal can be studied independently. The last is to transform the signal, or its spectral envelope, into another representation that is more efficient.
3.2. Speech analysis model
The general model for speech analysis is shown in Figure 3.1. The signal at each step is also shown in the illustration.
The speech signal is pre-processed by passing it through a low-pass filter with a cutoff frequency of about 8 kHz. The filtered signal is then converted to digital form by an ADC. Typically the sampling frequency is 16 kHz and the quantization resolution is 16 bits.
The digital speech signal is divided into frames, typically about 30 ms long with a frame shift of 10 ms. Each analysis frame is then tapered by applying a common window function such as Hamming or Hanning. The windowed signal is passed to a spectral analysis method (for example STFT or LPC). After the basic spectral analysis, the result may be passed on to further blocks that extract features.
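The framing and windowing step can be sketched as follows, using only the Python standard library. The frame length (30 ms), frame shift (10 ms) and 16 kHz sampling rate come from the text; the function names and the 200 Hz test tone are assumptions made for the example.

```python
import math

def hamming(length):
    """Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]

def frame_signal(samples, fs, frame_ms=30, hop_ms=10):
    """Split samples into overlapping frames and apply a Hamming window."""
    frame_len = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    window = hamming(frame_len)
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + i] * window[i] for i in range(frame_len)])
    return frames

fs = 16000
samples = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs // 10)]  # 100 ms
frames = frame_signal(samples, fs)
```

With these settings each frame holds 480 samples and consecutive frames share 320 samples, so 100 ms of signal yields 8 full frames.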
3.3. Short-time speech analysis
In analysis theory we often overlook an important point: the analysis must be carried out over a limited time interval. For example, the continuous-time Fourier transform is an extremely useful tool for signal analysis, but it requires the signal to be known for all time. Moreover, the properties or features of the signal that we want to study must be quantities that do not change with time. In practice this is hard to achieve, because signal analysis for real applications works on finite time spans. Most signals, and speech signals in particular, are not time-invariant.
Figure 3.1 General model of speech signal processing
In principle, known analysis techniques can be applied to the signal over short time spans. However, because speech is a process that carries dynamic information, short-time analysis cannot be limited to a single isolated frame.
As mentioned, the speech signal varies with time. Its basic characteristics include the excitation source, pitch, amplitude, and so on. Time-varying parameters of the speech signal include the fundamental frequency (pitch), the sound class (voiced, unvoiced, fricative, or silence), the main resonance frequencies (formants), the vocal tract area function, and so on.
Short-time analysis means examining the signal in a small time interval around the time instant n of interest. These intervals are usually 10-30 ms long. This allows us to assume that the properties of the speech waveform are relatively stable within the interval. The short piece of signal used for analysis is usually called a frame or a segment. A frame is defined as the product of a shifted window function w(m) and the signal sequence s(n):
$$s_n(m) = s(m)\,w(n-m) \qquad (3.1)$$
A frame can be understood as a piece of the signal cut out by a window function to form a new sequence whose values are zero outside the interval $m \in [n-N+1, n]$, where N is the window length. Equation (3.1) shows that the frame depends on its end time n. With a frame defined this way, short-time processing of the signal is equivalent to long-time processing of the frame.
As mentioned, speech analysis cannot be done on a single isolated frame; it must be carried out on successive frames. In practice, to avoid losing information, the frames are usually taken so that they overlap. In other words, two adjacent frames share at least M > 0 samples. Figure 3.2 illustrates the division into frames with a window function.
Figure 3.2 Signal analysis on overlapping frames
A general short-time analysis can be written as:
$$X_n = \sum_{m=-\infty}^{\infty} T\{s(m)\,w(n-m)\} \qquad (3.2)$$
where $X_n$ is the analysis parameter (or vector of analysis parameters) at analysis time n. The operator T{} defines a short-time analysis function. The infinite limits in (3.2) mean that the sum is taken over all non-zero samples of the windowed frame. In other words, the sum runs over every value of m in the support of the window function.
Commonly used window functions are the rectangular window, the Hanning window and the Hamming window.
3.4. Time-domain speech analysis
Time-domain analysis operates directly on the signal waveform after windowing in the time domain. As mentioned in the previous section, only short-time analyses of the signal are considered. For simplicity, the formulas below are therefore short-time by default. Any analysis that is not short-time is noted explicitly.
a) Average energy
The first parameter of interest in time-domain speech analysis is the average energy. The average energy of the speech signal is defined as:
$$E_n = \sum_{m=-\infty}^{\infty} s_n^2(m) = \sum_{m=-\infty}^{\infty} \left[s(m)\,w(n-m)\right]^2 \qquad (3.3)$$
The average energy is very useful for estimating the properties of the excitation function in models of the speech production system and in speech synthesis models. It also provides a useful tool for deciding whether a sound is voiced, unvoiced, or silence, because the amplitude of unvoiced sounds is usually much smaller than that of voiced sounds.
The length of the analysis window must be chosen appropriately. It must be long enough to smooth the fluctuations of the signal energy within a frame, but not so long that the change of energy from one segment to the next is obscured.
A drawback of the average energy is that large signal levels, being squared, tend to bias the energy estimate of the whole frame significantly.
b) Average magnitude
As noted above, the average energy is quite sensitive to large signal values. An alternative quantity is therefore often used, the average magnitude, defined by:
$$M_n = \sum_{m=-\infty}^{\infty} |s(m)|\,w(n-m) \qquad (3.4)$$
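The difference in sensitivity between (3.3) and (3.4) can be checked numerically. The sketch below assumes a rectangular window that is already applied (each function receives the samples of one frame), and the frame values are made up for illustration.

```python
def short_time_energy(frame):
    """E_n: sum of squared samples in the frame, as in eq. (3.3)."""
    return sum(x * x for x in frame)

def average_magnitude(frame):
    """M_n: sum of absolute sample values in the frame, as in eq. (3.4)."""
    return sum(abs(x) for x in frame)

# Hypothetical frame: nine small samples and one large outlier.
frame = [0.1] * 9 + [1.0]
energy_share = (1.0 ** 2) / short_time_energy(frame)  # outlier's share of E_n
magnitude_share = 1.0 / average_magnitude(frame)      # outlier's share of M_n
```

In this example the single large sample accounts for more than 90% of the energy but only a little over half of the average magnitude, which is the reduced sensitivity the text refers to.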
c) Zero-crossing rate
Another parameter often used in time-domain speech analysis is the zero-crossing rate. A zero crossing occurs when the signal waveform crosses the horizontal axis, in other words when consecutive samples have different signs. Mathematically, the zero-crossing rate is defined as:
$$Z_n = \sum_{m=-\infty}^{\infty} 0.5\,\left|\operatorname{sgn}\{s(m)\} - \operatorname{sgn}\{s(m-1)\}\right|\,w(n-m) \qquad (3.5)$$
Here sgn(a) is the sign function: it equals 1 if $a \ge 0$ and -1 if $a < 0$. Clearly $0.5\,|\operatorname{sgn}\{s(m)\} - \operatorname{sgn}\{s(m-1)\}|$ equals 1 if s(m) and s(m-1) have different signs and 0 if they have the same sign. Therefore $Z_n$ is a weighted sum of all sign changes among the samples in the support of the shifted window w(n-m). The zero-crossing rate can be regarded as a measure of frequency. Although it varies considerably with time and with the type of signal, it differs clearly between unvoiced and voiced sounds. Voiced sounds show strong attenuation at high frequencies because of the natural low-pass character of the glottal pulses, while unvoiced sounds have most of their energy at high frequencies. Like the average energy, the zero-crossing rate is therefore an important parameter for deciding whether a signal is unvoiced, voiced, or silence.
d) Autocorrelation function
The autocorrelation function is often used to determine the periodicity of a signal, and it is the basis of many spectral analysis methods. The short-time autocorrelation function is defined in the same way as the ordinary autocorrelation function:
$$R_n(k) = \sum_{m=-\infty}^{\infty} s_n(m)\,s_n(m+k)$$
$$\phantom{R_n(k)} = \sum_{m=-\infty}^{\infty} s(m)\,w(n-m)\,s(m+k)\,w(n-k-m)$$
$$\phantom{R_n(k)} = \sum_{m=-\infty}^{\infty} s(m)\,s(m-k)\,\tilde{w}_k(n-m) \qquad (3.6)$$
The last line of (3.6) uses the fact that the autocorrelation is an even, symmetric function, together with the definition $\tilde{w}_k(m) = w(m)\,w(m+k)$.
As with the ordinary autocorrelation function, the short-time autocorrelation is related to the average energy of the signal:
$$E_n = \sum_{m=-\infty}^{\infty} s^2(m)\,\tilde{w}_0(n-m) = R_n(0) \qquad (3.7)$$
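A minimal sketch of the first line of (3.6) and of relation (3.7), assuming the frame passed in has already been windowed; the function name and the four-sample frame are illustrative only.

```python
def autocorrelation(frame, k):
    """R_n(k) = sum over m of s_n(m) * s_n(m + k) for a windowed frame."""
    return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))

frame = [1, 2, 3, 4]          # hypothetical windowed frame s_n(m)
r0 = autocorrelation(frame, 0)
r1 = autocorrelation(frame, 1)
r2 = autocorrelation(frame, 2)
energy = sum(x * x for x in frame)
```

Here `r0` equals the frame energy, as (3.7) states, and the values at larger lags are smaller because fewer sample pairs overlap.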
e) Average magnitude difference function
The average magnitude difference function (AMDF) is defined as:
$$\Delta M_n(k) = \sum_{m=-\infty}^{\infty} |s(m) - s(m-k)|\,w(n-m) \qquad (3.8)$$
Equation (3.8) shows that the AMDF, as a function of the time lag k, becomes very small when k approaches the period (if any) of the signal s(n). The AMDF is therefore a useful tool for determining the fundamental frequency of the speech signal.
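The behaviour of (3.8) near the period can be illustrated with a short sketch. It assumes a rectangular window (the frame is passed in directly), and the triangular test sequence with a period of 8 samples is made up for the example.

```python
def amdf(frame, k):
    """Average magnitude difference at lag k over one frame (rectangular window)."""
    return sum(abs(frame[m] - frame[m - k]) for m in range(k, len(frame)))

# Hypothetical periodic sequence with a period of 8 samples.
frame = [0, 1, 2, 3, 4, 3, 2, 1] * 5
lags = range(2, 21)
period = min(lags, key=lambda k: amdf(frame, k))  # first lag with the smallest AMDF
```

The AMDF is exactly zero at lag 8 (and again at lag 16) and positive elsewhere in the search range, so the first minimum identifies the period.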
3.5. Frequency-domain speech analysis
3.5.1 Spectral structure of the speech signal
In speech analysis, the spectral features of speech are often used instead of the time-domain signal itself. This follows from the view that speech, like other deterministic signals, can be regarded as a sum of sinusoids with slowly varying amplitude and phase. An equally important reason is that human perception of speech is related more directly to the spectral information of the signal, while its phase information does not play a decisive role.
The complex short-time spectrum of the speech signal is defined as the Fourier transform (FT) of the signal frame for a fixed analysis time n:
$$S_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} s(m)\,w(n-m)\,e^{-j\omega m} \qquad (3.9)$$
Expression (3.9) can be rewritten as:
$$S_n(e^{j\omega}) = \left.\left[\left(s(\tilde{n})\,e^{-j\omega \tilde{n}}\right) * w(\tilde{n})\right]\right|_{\tilde{n}=n} \qquad (3.10)$$
Expression (3.10) is an interpretation of the short-time Fourier transform from a filtering point of view. The modulated signal $s(\tilde{n})e^{-j\omega \tilde{n}}$ shifts the spectrum of $s(\tilde{n})$ down by $\omega$, and the result is then selected by the window, which acts as a filter whose passband is centered at zero frequency.
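Alternatively, (3.9) can be written as:
$$S_n(e^{j\omega}) = \left.\left[s(\tilde{n}) * \left(w(\tilde{n})\,e^{j\omega \tilde{n}}\right)\right] e^{-j\omega \tilde{n}}\right|_{\tilde{n}=n} \qquad (3.11)$$
Equation (3.11) can be interpreted as follows. The signal $s(n)$ is passed through a band-pass filter with center frequency $\omega$ and impulse response $w(n)\,e^{j\omega n}$. The result is shifted down in frequency by modulating it with $e^{-j\omega n}$, which produces a low-band signal.
Figure 3.3 shows a signal frame and its corresponding spectrum.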
The power spectral density over a short time interval, that is the short-time spectrum of the speech signal, can be regarded as the product of two components. The first is the spectral envelope, which varies slowly with frequency. The second is the spectral fine structure, which varies rapidly with frequency. For voiced sounds the fine structure forms a periodic pattern; for unvoiced sounds it does not. The spectral envelope, or overall spectral shape, describes not only the resonance and anti-resonance characteristics of the articulatory organs but also the overall characteristics of the glottal source spectrum and of radiation at the lips and nostrils. The spectral fine structure describes the periodicity of the sound source.
Equation (3.9) is a function of the continuous analysis frequency $\omega$. For the FT to be useful in practical analysis, it must be computed on a discrete set of frequencies, with a window of finite width that advances in steps of R > 1 samples. We then have:
$$S_{rR}(k) = \sum_{m=rR-L+1}^{rR} s(m)\,w(rR-m)\,e^{-j\frac{2\pi}{N}km}, \qquad k = 0, 1, \ldots, N-1 \qquad (3.12)$$
N is the number of equally spaced frequencies in the range $0 \le \omega < 2\pi$ and L is the window length (in samples). The window w(n) is assumed to be causal and non-zero only for $0 \le m \le L-1$, so the windowed segment s(m)w(rR-m) is non-zero only for $rR-L+1 \le m \le rR$.
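Equation (3.12) can be evaluated directly with the standard library, as in the sketch below. It is written for clarity rather than speed (a practical implementation would use an FFT), and the function name, the rectangular window and the test cosine are assumptions made for the example.

```python
import cmath
import math

def stft_frame(s, w, r, R, N):
    """Evaluate eq. (3.12): the N-point spectrum of the frame ending at m = r*R."""
    L = len(w)
    end = r * R
    first = max(0, end - L + 1)
    last = min(end, len(s) - 1)
    spectrum = []
    for k in range(N):
        acc = 0j
        for m in range(first, last + 1):
            # w is causal, so the window sample aligned with s(m) is w(end - m)
            acc += s[m] * w[end - m] * cmath.exp(-2j * math.pi * k * m / N)
        spectrum.append(acc)
    return spectrum

# Hypothetical test: a cosine that completes 4 cycles in 32 samples.
N = L = 32
s = [math.cos(2 * math.pi * 4 * m / N) for m in range(48)]
w = [1.0] * L                      # rectangular window
X = stft_frame(s, w, r=2, R=16, N=N)
peak_bin = max(range(N // 2 + 1), key=lambda k: abs(X[k]))
```

Because the frame spans exactly four periods of the cosine, all of the energy in the lower half of the spectrum falls in bin 4, with magnitude L/2 = 16.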
Figure 3.3 A signal frame and its corresponding spectrum
3.5.2 Spectrogram
The spectrogram is one of the basic tools of speech spectral analysis. It converts the two-dimensional speech waveform into a three-dimensional structure (amplitude/frequency/time). In a spectrogram, time and frequency are the horizontal and vertical axes respectively, and amplitude is shown by the darkness of the plot. The peaks of the signal spectrum appear as dark horizontal bands, and the center frequencies of these bands are usually taken to be the formants. Voiced sounds produce vertical striations in the spectrogram because the amplitude of the speech signal increases each time the glottis closes. The noise in unvoiced sounds produces rectangular patterns that end randomly and vary in darkness because of the instantaneous changes in signal energy. The spectrogram shows only the magnitude spectrum of the signal and ignores the phase, because phase is considered unimportant in most speech applications.
To build a spectrogram, the magnitude of the short-time Fourier transform (STFT) $|S_n(e^{j\omega})|$ is plotted with time on the horizontal axis and frequency (from 0 to $\pi$, that is from 0 to $F_s/2$, where $F_s$ is the sampling frequency) on the vertical axis. The magnitude is shown by darkness, usually on a logarithmic scale:
$$S(t_r, f_k) = 20\log_{10}\left|S_{rR}(k)\right| \qquad (3.13)$$
where $t_r = rRT$, $f_k = k/(NT)$, and T is the sampling period of the signal. Figure 3.4 shows the spectrogram of a speech signal together with the corresponding waveform.
Figure 3.4 Spectrograms of the speech signal "Should we chase"
The two spectrograms are built with windows of different lengths. The upper spectrogram uses a window of 101 samples, corresponding to 10 ms. This window length is approximately equal to the period of the waveform in the voiced segments. As a result, in the voiced segments the spectrogram shows vertically oriented striations, because the sliding window alternately covers mostly large-amplitude samples and mostly small-amplitude samples. In other words, when the analysis window is short, each individual pitch period is resolved clearly in time, while the frequency resolution is poor. For this reason, a spectrogram obtained with a short analysis window is called a wideband spectrogram. Conversely, a spectrogram obtained with a long analysis window is called a narrowband spectrogram. A narrowband spectrogram has high frequency resolution but low time resolution. The lower panel of Figure 3.4 uses an analysis window of 401 samples, corresponding to 40 ms, which spans several periods of the signal. As can be seen, this spectrogram is no longer sensitive to changes in time.
3.6. Linear predictive coding (LPC) analysis
Linear prediction analysis is one of the most powerful and widely used speech analysis methods. Its importance lies in its ability to provide accurate estimates of the speech parameters and in its relatively fast computation.
The LPC (Linear Predictive Coding) analysis model for speech is shown in Figure 3.5. LPC analysis performs spectral analysis on blocks of the signal, also called speech frames, using an all-pole model. This means that the resulting spectral representation $X_n(e^{j\omega})$ is constrained to the form $\sigma/A(e^{j\omega})$, where $A(e^{j\omega})$ is a polynomial of order p, written in the z-transform as:
$$A(z) = 1 - a_1 z^{-1} - a_2 z^{-2} - \cdots - a_p z^{-p} \qquad (3.14)$$
Figure 3.5 LPC analysis model for speech signals
The order of the polynomial, p, is also called the LPC analysis order. The output of the LPC spectral analysis block is a vector of coefficients (the LPC parameters) that specify the spectrum of the all-pole model that best matches the spectrum of the original signal over the time span of the analyzed samples.
The idea behind the LPC model is that a speech sample at any time n, $s(n)$, can be approximated as a linear combination of the p preceding samples. In other words:
$$s(n) \approx a_1 s(n-1) + a_2 s(n-2) + \cdots + a_p s(n-p) \qquad (3.15)$$
The coefficients $a_1, a_2, \ldots, a_p$ are assumed to be constant within the analysis frame. Expression (3.15) becomes an equality if an excitation term Gu(n) is added:
$$s(n) = \sum_{i=1}^{p} a_i\,s(n-i) + G\,u(n) \qquad (3.16)$$
In (3.16), u(n) is the normalized excitation and G is the gain of the excitation. In the z-domain, (3.16) becomes:
$$S(z) = \sum_{i=1}^{p} a_i z^{-i} S(z) + G\,U(z) \qquad (3.17)$$
so the corresponding transfer function is:
$$H(z) = \frac{S(z)}{G\,U(z)} = \frac{1}{1 - \sum_{i=1}^{p} a_i z^{-i}} = \frac{1}{A(z)} \qquad (3.18)$$
The transfer function (3.18) can be realized by the block diagram in Figure 3.6, which can be explained as follows. The normalized excitation u(n) is multiplied by the gain G and becomes the input of an all-pole system H(z) = 1/A(z), which produces the speech signal s(n). The actual excitation of speech is a quasi-periodic pulse train for voiced sounds and a random noise source for unvoiced sounds. From this, a speech synthesizer based on the LPC analysis model is easily built, as in Figure 3.7. In this synthesizer, a switch selects the excitation source appropriate to voiced or unvoiced sounds. The gain G is estimated from the speech signal. The digital filter H(z) is controlled by the vocal tract parameters corresponding to the speech being produced. Specifically, the parameters of this synthesis model are the voiced/unvoiced classification, the pitch period of the signal, the gain, and the filter coefficients $a_k$. All of these parameters vary slowly with time.
Figure 3.6 Linear prediction model of speech
Suppose the linear combination of the samples preceding the current time is taken as an estimate of the signal, denoted $\tilde{s}(n)$:
$$\tilde{s}(n) = \sum_{k=1}^{p} a_k\,s(n-k) \qquad (3.19)$$
The prediction error e(n) is then:
$$e(n) = s(n) - \tilde{s}(n) = s(n) - \sum_{k=1}^{p} a_k\,s(n-k) \qquad (3.20)$$
In other words, the corresponding error transfer function is:
$$A(z) = \frac{E(z)}{S(z)} = 1 - \sum_{k=1}^{p} a_k z^{-k} \qquad (3.21)$$
It follows that if the speech signal is generated by the model in Figure 3.6, the prediction error e(n) equals the excitation signal Gu(n).
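This property can be verified numerically: synthesizing a signal with the all-pole model of (3.16) and then applying the inverse filter of (3.20) returns the scaled excitation. The coefficients, gain and impulse-train excitation in the sketch below are arbitrary values chosen for illustration.

```python
def synthesize(excitation, a, gain):
    """All-pole synthesis, eq. (3.16): s(n) = sum_k a_k s(n-k) + G u(n)."""
    s = []
    for n, u in enumerate(excitation):
        predicted = sum(a[k - 1] * s[n - k]
                        for k in range(1, len(a) + 1) if n - k >= 0)
        s.append(predicted + gain * u)
    return s

def prediction_error(s, a):
    """Inverse filter, eq. (3.20): e(n) = s(n) - sum_k a_k s(n-k)."""
    return [s[n] - sum(a[k - 1] * s[n - k]
                       for k in range(1, len(a) + 1) if n - k >= 0)
            for n in range(len(s))]

a = [1.3, -0.6]                                            # hypothetical stable predictor
gain = 0.5
excitation = [1.0 if n % 10 == 0 else 0.0 for n in range(50)]  # impulse train
speech = synthesize(excitation, a, gain)
residual = prediction_error(speech, a)
max_diff = max(abs(e - gain * u) for e, u in zip(residual, excitation))
```

The residual matches `gain * excitation` up to floating-point rounding, which is why the prediction error of real speech is often used as an estimate of the excitation.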
The problem in LPC analysis is to determine the set of coefficients $a_k$ directly from the speech signal so that the spectral properties of the filter in Figure 3.7 match the spectrum of the speech signal within the analysis window. Because the spectral characteristics of speech change with time, the predictor coefficients at a given time n must be estimated from a short segment of the speech signal around n. The basic approach is therefore to find a set of predictor coefficients that minimizes the mean squared prediction error over the short segment being analyzed. Spectral analysis of this kind is usually carried out on successive frames spaced on the order of 10 ms apart.
Figure 3.7 LPC speech synthesis model
To set up the equations and find suitable predictor coefficients, we define the short-time signal frame and the corresponding short-time error:
$$s_n(m) = s(n+m) \qquad (3.22)$$
$$e_n(m) = e(n+m) \qquad (3.23)$$
We need to minimize the mean squared error signal at time n:
$$\mathcal{E}_n = \sum_{m} e_n^2(m) \qquad (3.24)$$
Using the definitions of $e_n(m)$ and $s_n(m)$, (3.24) can be rewritten as:
$$\mathcal{E}_n = \sum_{m}\left[s_n(m) - \sum_{k=1}^{p} a_k\,s_n(m-k)\right]^2 \qquad (3.25)$$
To find the minimum of (3.25), we differentiate with respect to each coefficient $a_k$ in turn and set the derivative to zero:
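$$\frac{\partial \mathcal{E}_n}{\partial a_k} = 0, \qquad k = 1, 2, \ldots, p \qquad (3.26)$$
This gives:
$$\sum_{m} s_n(m-i)\,s_n(m) = \sum_{k=1}^{p} \hat{a}_k \sum_{m} s_n(m-i)\,s_n(m-k) \qquad (3.27)$$
The sums of the form $\sum_m s_n(m-i)\,s_n(m-k)$ are the short-time covariance coefficients of $s_n(m)$, denoted:
$$\phi_n(i,k) = \sum_{m} s_n(m-i)\,s_n(m-k) \qquad (3.28)$$
Expression (3.27) can then be written compactly as:
$$\phi_n(i,0) = \sum_{k=1}^{p} \hat{a}_k\,\phi_n(i,k), \qquad i = 1, 2, \ldots, p \qquad (3.29)$$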
Expression (3.29) is a system of p equations in p unknowns. The minimum mean squared error $\hat{\mathcal{E}}_n$ follows directly:
$$\hat{\mathcal{E}}_n = \sum_{m} s_n^2(m) - \sum_{k=1}^{p} \hat{a}_k \sum_{m} s_n(m)\,s_n(m-k) = \phi_n(0,0) - \sum_{k=1}^{p} \hat{a}_k\,\phi_n(0,k) \qquad (3.30)$$
The minimum mean squared error therefore consists of a fixed term $\phi_n(0,0)$ and terms that depend on the predictor coefficients.
To find the optimal predictor coefficients $\hat{a}_k$, we first compute $\phi_n(i,k)$ for $1 \le i \le p$ and $0 \le k \le p$, and then solve the p simultaneous equations (3.29). In practice, solving the system and computing the $\phi_n(i,k)$ terms depend strongly on the range of m used to define the analysis frame and the region over which the mean squared error is estimated. There are two standard ways of choosing this range for speech signals: the autocorrelation method and the covariance method.
The autocorrelation method follows directly from limiting the range of m so that the speech segment $s_n(m)$ is zero outside the interval $0 \le m \le N-1$. This is equivalent to assuming that the speech signal s(n+m) is multiplied by a finite window w(m) that is zero outside $0 \le m \le N-1$. In other words, the speech samples used to minimize the mean squared error can be written as:
$$s_n(m) = \begin{cases} s(n+m)\,w(m), & 0 \le m \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (3.31)$$
From (3.31), for m < 0 the error signal $e_n(m)$ is zero because $s_n(m) = 0$. Similarly, for m > N-1+p there is no prediction error because $s_n(m) = 0$ there as well. However, in the region near m = 0 (from m = 0 to m = p-1) the windowed signal is predicted from preceding samples, some of which are zero, so the prediction error in this region can be relatively large. In the region near m = N-1 (from m = N-1 to m = N-1+p) a large prediction error is also possible, because the zero-valued samples produced by windowing are predicted from the last few non-zero samples of the signal. For voiced speech, these large errors at the beginning or end of the frame are most apparent when a pitch period begins at, or very close to, m = 0 or m = N-1. For unvoiced speech the effect is almost absent, because no part of the signal is position sensitive. These effects, together with the windowed signal, are illustrated in Figures 3.8-3.10.
Figure 3.8 Large prediction error at the beginning of the frame for voiced speech
Figure 3.9 Large prediction error at the end of the frame for voiced speech
Figure 3.10 Prediction error for unvoiced speech
The purpose of windowing is to taper the signal near m = 0 and m = N-1 so that the errors in these boundary regions are minimized.
With the windowed segment defined in this way, the mean squared error can be written as:
$$\mathcal{E}_n = \sum_{m=0}^{N-1+p} e_n^2(m) \qquad (3.32)$$
and $\phi_n(i,k)$ can be written as:
$$\phi_n(i,k) = \sum_{m=0}^{N-1+p} s_n(m-i)\,s_n(m-k), \qquad 1 \le i \le p,\; 0 \le k \le p \qquad (3.33)$$
By a change of index, this expression can be written as:
$$\phi_n(i,k) = \sum_{m=0}^{N-1-(i-k)} s_n(m)\,s_n(m+i-k), \qquad 1 \le i \le p,\; 0 \le k \le p \qquad (3.34)$$
Expression (3.34) depends only on the difference i-k, not on i and k as two independent variables. The covariance function $\phi_n(i,k)$ therefore reduces to the autocorrelation function:
$$\phi_n(i,k) = R_n(i-k) = \sum_{m=0}^{N-1-(i-k)} s_n(m)\,s_n(m+i-k), \qquad 1 \le i \le p,\; 0 \le k \le p \qquad (3.35)$$
Since the autocorrelation function is symmetric, that is $R_n(-k) = R_n(k)$, the LPC equations can be written as:
$$\sum_{k=1}^{p} R_n(|i-k|)\,\hat{a}_k = R_n(i), \qquad 1 \le i \le p \qquad (3.36)$$
In matrix form:
$$\begin{bmatrix} R_n(0) & R_n(1) & R_n(2) & \cdots & R_n(p-1) \\ R_n(1) & R_n(0) & R_n(1) & \cdots & R_n(p-2) \\ R_n(2) & R_n(1) & R_n(0) & \cdots & R_n(p-3) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ R_n(p-1) & R_n(p-2) & R_n(p-3) & \cdots & R_n(0) \end{bmatrix} \begin{bmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \vdots \\ \hat{a}_p \end{bmatrix} = \begin{bmatrix} R_n(1) \\ R_n(2) \\ R_n(3) \\ \vdots \\ R_n(p) \end{bmatrix} \qquad (3.37)$$
The autocorrelation matrix in (3.37) is a Toeplitz matrix (a symmetric matrix whose elements along each diagonal are equal), so the system can be solved easily with known efficient algorithms.
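One such efficient algorithm is the Levinson-Durbin recursion, which the text does not name; it is used below as one common choice for solving the Toeplitz system (3.37) in O(p^2) operations. The sketch uses only the standard library. The test signal is the impulse response of a hypothetical all-pole filter with coefficients 1.3 and -0.6, so the coefficients that should be recovered are known in advance.

```python
def autocorr(frame, p):
    """R_n(0..p) of a windowed frame, as in eq. (3.35)."""
    N = len(frame)
    return [sum(frame[m] * frame[m + k] for m in range(N - k)) for k in range(p + 1)]

def levinson_durbin(R, p):
    """Solve sum_k a_k R(|i-k|) = R(i), i = 1..p; return (a_1..a_p, minimum error)."""
    a = [0.0] * (p + 1)   # a[0] is unused
    err = R[0]
    for i in range(1, p + 1):
        # reflection coefficient for order i
        k = (R[i] - sum(a[j] * R[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def impulse_response(a, length):
    """Impulse response of 1/A(z) with A(z) = 1 - sum_k a_k z^-k."""
    h = []
    for n in range(length):
        h.append((1.0 if n == 0 else 0.0)
                 + sum(a[k - 1] * h[n - k] for k in range(1, len(a) + 1) if n - k >= 0))
    return h

frame = impulse_response([1.3, -0.6], 200)   # decays to practically zero within the frame
R = autocorr(frame, 2)
coeffs, min_error = levinson_durbin(R, 2)
```

For this signal the recursion returns coefficients very close to 1.3 and -0.6, and the minimum error is close to 1, the squared gain of the unit impulse that excited the filter.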
The covariance method differs from the autocorrelation method described above. It fixes the interval over which the mean squared error is computed to $0 \le m \le N-1$ and uses the signal in that interval directly, without applying a window.
The mean squared error is then:
$$\mathcal{E}_n = \sum_{m=0}^{N-1} e_n^2(m) \qquad (3.38)$$
and the covariance is:
$$\phi_n(i,k) = \sum_{m=0}^{N-1} s_n(m-i)\,s_n(m-k), \qquad 1 \le i \le p,\; 0 \le k \le p \qquad (3.39)$$
or, by a change of index:
$$\phi_n(i,k) = \sum_{m=-i}^{N-i-1} s_n(m)\,s_n(m+i-k), \qquad 1 \le i \le p,\; 0 \le k \le p \qquad (3.40)$$
The computation in (3.40) involves the samples $s_n(m)$ from m = -p to m = N-1-p when i = p, and the samples $s_n(m+i-k)$ from 0 to N-1. The signal range needed for the complete computation is therefore from $s_n(-p)$ to $s_n(N-1)$. In other words, the computation needs samples outside the error minimization interval, namely $s_n(-p), s_n(-p+1), \ldots, s_n(-1)$.
Using this extended signal range to compute the covariance values $\phi_n(i,k)$, the LPC equations in matrix form are:
$$\begin{bmatrix} \phi_n(1,1) & \phi_n(1,2) & \phi_n(1,3) & \cdots & \phi_n(1,p) \\ \phi_n(2,1) & \phi_n(2,2) & \phi_n(2,3) & \cdots & \phi_n(2,p) \\ \phi_n(3,1) & \phi_n(3,2) & \phi_n(3,3) & \cdots & \phi_n(3,p) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi_n(p,1) & \phi_n(p,2) & \phi_n(p,3) & \cdots & \phi_n(p,p) \end{bmatrix} \begin{bmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \vdots \\ \hat{a}_p \end{bmatrix} = \begin{bmatrix} \phi_n(1,0) \\ \phi_n(2,0) \\ \phi_n(3,0) \\ \vdots \\ \phi_n(p,0) \end{bmatrix} \qquad (3.41)$$
The covariance matrix is symmetric (because $\phi_n(i,k) = \phi_n(k,i)$) but it is not a Toeplitz matrix. The system can be solved with the Cholesky decomposition. In practice, the full covariance form of LPC analysis is not commonly used in speech recognition systems.
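A sketch of the covariance method with a hand-written Cholesky solve is given below, using only the standard library. The buffer layout (p history samples followed by the N-sample frame) and the test signal, the impulse response of a hypothetical all-pole filter with coefficients 1.3 and -0.6, are assumptions made for the example.

```python
import math

def covariance_lpc(s, p):
    """Covariance-method LPC. s holds p history samples followed by the N-sample frame."""
    N = len(s) - p

    def phi(i, k):
        # eq. (3.39): sum over m = 0..N-1 of s_n(m-i) s_n(m-k), with s_n(m) = s[p + m]
        return sum(s[p + m - i] * s[p + m - k] for m in range(N))

    Phi = [[phi(i, k) for k in range(1, p + 1)] for i in range(1, p + 1)]
    psi = [phi(i, 0) for i in range(1, p + 1)]

    # Cholesky factorization Phi = L L^T (Phi must be positive definite)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = Phi[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]

    # Forward substitution: L y = psi
    y = [0.0] * p
    for i in range(p):
        y[i] = (psi[i] - sum(L[i][t] * y[t] for t in range(i))) / L[i][i]

    # Back substitution: L^T a = y
    a = [0.0] * p
    for i in reversed(range(p)):
        a[i] = (y[i] - sum(L[t][i] * a[t] for t in range(i + 1, p))) / L[i][i]
    return a

def impulse_response(a, length):
    """Impulse response of 1/A(z) with A(z) = 1 - sum_k a_k z^-k."""
    h = []
    for n in range(length):
        h.append((1.0 if n == 0 else 0.0)
                 + sum(a[k - 1] * h[n - k] for k in range(1, len(a) + 1) if n - k >= 0))
    return h

signal = impulse_response([1.3, -0.6], 60)
coeffs = covariance_lpc(signal, 2)
```

Because no excitation falls inside the frame, the prediction error there can be driven to zero and the method recovers the coefficients up to rounding, without any window being applied.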
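3.7. Cepstral analysis
The concept of the cepstrum was introduced by Bogert, Healy and Tukey. The cepstrum is defined as the inverse Fourier transform (IFT) of the logarithm of the magnitude spectrum of the signal. In other words, the cepstrum of a discrete-time signal is given by:
$$c_n(m) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log\left|S_n(e^{j\omega})\right|\,e^{j\omega m}\,d\omega \qquad (3.42)$$
Here $\log|S_n(e^{j\omega})|$ is the logarithm of the magnitude of the FT of the signal. Definition (3.42) can be extended to the complex cepstrum:
$$\hat{c}_n(m) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \log\{S_n(e^{j\omega})\}\,e^{j\omega m}\,d\omega \qquad (3.43)$$
In (3.43), $\log\{S_n(e^{j\omega})\}$ is the complex logarithm of $S_n(e^{j\omega})$, defined as:
$$\log\{S_n(e^{j\omega})\} = \log\left|S_n(e^{j\omega})\right| + j\arg\left[S_n(e^{j\omega})\right] \qquad (3.44)$$
In discrete form, the computation uses the N-point discrete Fourier transform:
$$S(k) = \sum_{n=0}^{N-1} s(n)\,e^{-j2\pi kn/N} \qquad (3.45)$$
$$X(k) = \log|S(k)| + j\arg\left[S(k)\right] \qquad (3.46)$$
$$\hat{s}(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,e^{j2\pi kn/N} \qquad (3.47)$$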
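The discrete computation can be sketched with a direct DFT from the standard library. The sketch computes the real cepstrum, that is it keeps only the $\log|S(k)|$ term of (3.46) as in definition (3.42). The test signal, a unit impulse followed by an echo of amplitude 0.5 eight samples later, and the small constant added before the logarithm are assumptions made for the example.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct O(N^2) DFT or inverse DFT, as in eqs. (3.45) and (3.47)."""
    N = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N) for n in range(N))
           for k in range(N)]
    return [v / N for v in out] if inverse else out

def real_cepstrum(frame):
    """Inverse DFT of the log magnitude spectrum."""
    spectrum = dft(frame)
    # 1e-12 guards against log(0); it is an implementation choice, not part of the definition
    log_mag = [math.log(abs(S) + 1e-12) for S in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]

# Hypothetical test signal: an impulse plus an echo of amplitude 0.5 at lag 8.
N = 64
frame = [0.0] * N
frame[0] = 1.0
frame[8] = 0.5
cep = real_cepstrum(frame)
peak_lag = max(range(1, N // 2), key=lambda m: cep[m])
```

The cepstrum shows its largest peak at lag 8, the delay of the echo. A periodic excitation behaves in a similar way, which is what makes the cepstrum useful for estimating the pitch period.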
3.8. Methods for determining formant frequencies
The formants of the speech signal are among the most important and useful parameters, widely applied in areas such as speech processing, synthesis and recognition. The formants are the resonance frequencies of the vocal tract. In spectral representations such as the spectrogram they appear as regions of high energy, and they change slowly with time as the articulators move. Formants are important and useful in speech processing research because they describe the most important aspects of speech with a very small set of features. For example, in speech coding, if formant parameters are used to represent the configuration of the vocal tract and a few auxiliary parameters represent the excitation source, coding rates as low as 2.4 kbps can be achieved.
Many studies in speech processing and recognition have shown that formant parameters are the best candidates for representing the vocal tract spectrum efficiently. However, determining the formants is not simply a matter of locating the peaks in the magnitude spectrum, because the spectral peaks of the vocal tract output depend in a complex way on many factors, such as the vocal tract configuration and the excitation sources.
Formant estimation methods search for peaks in spectral representations, usually obtained from STFT analysis or from linear predictive coding (LPC).
a) Determining formants from STFT analysis
Analog and discrete STFT analysis has become a basic tool for many developments in speech analysis and synthesis.
The STFT clearly contains the formant information directly in its magnitude spectrum. It therefore serves as a basis for analyzing the formant frequencies of the speech signal.
b) Determining formants from LPC analysis
The formant frequencies can be estimated from the predictor parameters in one of two ways. The first is direct: factor the predictor polynomial and decide, from the roots obtained, which roots correspond to formants. The second is to use spectral analysis and select the formants corresponding to the sharp peaks with one of the known peak-picking algorithms.
An advantage of using LPC analysis for formants is that the center frequencies of the formants and their bandwidths can be determined accurately by factoring the predictor polynomial. For a chosen LPC order p, the maximum number of complex conjugate pole pairs is p/2. Labeling which poles correspond to formants is therefore simpler than in other methods. In addition, extraneous poles can usually be separated easily in LPC analysis, because their bandwidths are usually very large compared with the typical bandwidths of speech formants.
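Once the predictor polynomial has been factored, each complex pole $z = r e^{j\theta}$ maps to a center frequency $F = \theta F_s / (2\pi)$ and a bandwidth $B = -(F_s/\pi)\ln r$. These are the usual conversion formulas; the text does not state them, so they are given here as background. The sketch below applies them to a single second-order section, for which the roots follow from the quadratic formula. The sampling rate and the 500 Hz / 80 Hz resonance are hypothetical values, and a higher-order polynomial would need a general root finder.

```python
import cmath
import math

def second_order_poles(a1, a2):
    """Roots of z^2 - a1 z - a2, i.e. the poles of 1/A(z) for A(z) = 1 - a1 z^-1 - a2 z^-2."""
    disc = cmath.sqrt(a1 * a1 + 4 * a2)
    return (a1 + disc) / 2, (a1 - disc) / 2

def pole_to_formant(pole, fs):
    """Map a complex pole to (center frequency in Hz, bandwidth in Hz)."""
    freq = abs(cmath.phase(pole)) * fs / (2 * math.pi)
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return freq, bandwidth

# Hypothetical resonance: 500 Hz center frequency, 80 Hz bandwidth, Fs = 8 kHz.
fs = 8000
r = math.exp(-math.pi * 80 / fs)
theta = 2 * math.pi * 500 / fs
a1, a2 = 2 * r * math.cos(theta), -r * r
pole, _ = second_order_poles(a1, a2)
freq, bandwidth = pole_to_formant(pole, fs)
```

A pole close to the unit circle gives a narrow bandwidth and is a formant candidate, while a pole with a small radius gives a large bandwidth and can be discarded as extraneous, as described above.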
3.9. Methods for determining the fundamental frequency
The fundamental frequency $F_0$ is the vibration frequency of the vocal folds. It depends on sex and age: $F_0$ is usually higher for women than for men, and higher for younger speakers than for older ones. For male voices $F_0$ typically lies between 80 and 250 Hz, and for female voices between 150 and 500 Hz. The variation of $F_0$ determines the tone of a word as well as the intonation of a sentence. The question is how to determine the fundamental frequency. Methods include: the autocorrelation method; the average magnitude difference function method; the method using inverse filtering and autocorrelation; and homomorphic processing.
a) Using the autocorrelation function
The autocorrelation function $R_n(k)$ reaches its maxima at lags that are multiples of the fundamental period of the signal. The fundamental frequency is then the frequency at which the peaks of $R_n(k)$ occur. The problem becomes one of determining the period of the autocorrelation function.
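A minimal sketch of this idea picks the lag with the largest autocorrelation inside the range of plausible pitch periods. The search range of 80-500 Hz follows the $F_0$ ranges quoted above; the rectangular window, the function name and the two-harmonic test signal are assumptions made for the example.

```python
import math

def estimate_f0_autocorr(frame, fs, f_min=80, f_max=500):
    """Estimate F0 as fs divided by the lag that maximizes the short-time autocorrelation."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)

    def R(k):
        return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))

    best_lag = max(range(lag_min, lag_max + 1), key=R)
    return fs / best_lag

# Hypothetical voiced-like signal: 200 Hz fundamental plus its second harmonic, 50 ms at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) + 0.5 * math.sin(2 * math.pi * 400 * n / fs)
         for n in range(400)]
f0 = estimate_f0_autocorr(frame, fs)
```

With a finite frame the autocorrelation peaks shrink as the lag grows, because fewer sample pairs overlap. This favours the first period (40 samples here) over its multiples and gives 200 Hz. Real speech usually needs additional steps, such as center clipping or a voicing decision, which are outside this sketch.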
b) Using the average magnitude difference function (AMDF)
As mentioned, if the sequence s(n) is periodic with period T, the AMDF $\Delta M_n(k)$ vanishes at lags k that are multiples of T. It is therefore enough to find two adjacent minima; the distance between them gives the period of the sequence, and from it the fundamental frequency.
c) Using the zero-crossing rate
For discrete-time signals, a zero crossing occurs when adjacent samples have different signs. The zero-crossing rate is therefore a simple measure of the frequency of the signal. For example, a sinusoid of frequency $F_0$ sampled at frequency $F_s$ has $F_s/F_0$ samples per period. Since each period contains two zero crossings, the average zero-crossing rate per sample is $Z_n = 2F_0/F_s$. The average zero-crossing rate thus gives a rough estimate of the frequency of the sinusoid.
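The relation $Z_n = 2F_0/F_s$ can be checked with a short sketch. It assumes a rectangular window with weights 1/N, so the rate is the number of sign changes per sample. The 100 Hz test tone and the phase offset of 0.3 rad, which keeps samples away from exact zeros, are choices made for the example.

```python
import math

def sgn(a):
    """Sign function as used in eq. (3.5): 1 for a >= 0, -1 otherwise."""
    return 1 if a >= 0 else -1

def zero_crossing_rate(frame):
    """Average number of sign changes per sample over the frame."""
    crossings = sum(abs(sgn(frame[m]) - sgn(frame[m - 1])) // 2
                    for m in range(1, len(frame)))
    return crossings / len(frame)

fs = 8000
f0 = 100
frame = [math.sin(2 * math.pi * f0 * n / fs + 0.3) for n in range(800)]  # 10 periods
z = zero_crossing_rate(frame)
f_estimate = z * fs / 2   # invert Z_n = 2 F0 / Fs
```

Ten periods of the tone produce 20 sign changes in 800 samples, so the rate is 0.025 and the estimate is 100 Hz. For speech, which is not a pure sinusoid, the estimate is only a coarse indication of where the energy is concentrated.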
d) Using the STFT
From the Fourier representation of the speech signal, it is clear that the excitation of voiced speech produces sharp peaks in the spectrum, and that these peaks occur at multiples of the fundamental frequency. This is the basic principle of one of the methods for determining the fundamental frequency.
Hnh 3.11 S nn tn s
Xt biu thc ph tch cc hi (harmonic) nh sau:
( ) ( )
1
K
j j r
n n
r
P e S e
=
=
(3.48)
Nu ly l-ga-rt ca biu thc (3.48), thu c ph tch cc hi trong thang l-ga-rt:
( ) ( )
1
2 log
K
j j r
n n
r
P e S e
=
=
(3.49)
Hm
( )
j
n
P e
trong cng thc (3.49) l mt tng ca K ph nn tn s ca |S
n
(e
j
)|. Vic
s dng hm trong cng thc (3.49) xut pht t nhn xt rng vi tn hiu m hu thanh,
vic nn tn s bi cc h s nguyn s lm cc hi ca tn s c bn trng vi tn s c bn.
vng tn s gia cc hi, c mt hi ca cc s tn s khc cng b nn trng nhau, tuy
nhin ch ti tn s c bn l c cng c. Hnh 3.11 minh ha nhn xt va nu.
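The log harmonic product spectrum of (3.49) can be sketched as below. This is a minimal illustration under stated assumptions: a naive $O(N^2)$ DFT is used so the example needs only the standard library, no analysis window is applied, the small constant `eps` guards against the logarithm of zero, and the test frame (a weak fundamental with stronger harmonics) is synthetic.

```python
import cmath
import math

def dft_magnitude(x, n_bins):
    """Magnitudes of the first n_bins DFT bins (naive O(N^2) DFT)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n_bins)]

def hps_pitch(frame, fs, K=3, eps=1e-9):
    """Pick the bin that maximises 2 * sum_r log|S(r*k)|, r = 1..K (Eq. 3.49)."""
    n = len(frame)
    mag = dft_magnitude(frame, n // 2)
    k_max = (n // 2 - 1) // K            # highest bin whose K-th multiple exists
    best_k, best_val = 1, float("-inf")
    for k in range(1, k_max + 1):
        log_p = 2.0 * sum(math.log(mag[r * k] + eps) for r in range(1, K + 1))
        if log_p > best_val:
            best_k, best_val = k, log_p
    return best_k * fs / n

# Test frame: harmonics of 200 Hz with a deliberately weak fundamental.
fs, n = 8000, 400
amps = [0.5, 1.0, 0.8, 0.6]
frame = [sum(a * math.sin(2 * math.pi * 200 * (h + 1) * t / fs)
             for h, a in enumerate(amps)) for t in range(n)]
f0 = hps_pitch(frame, fs)
```

Even though the second harmonic is the strongest spectral line in this frame, only the bin of the fundamental has energy at all of its first $K$ multiples, so the product peaks there.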
e) Using cepstral analysis
In cepstral analysis one observes that, for voiced speech, the cepstrum has a sharp peak at the fundamental period of the signal, whereas for unvoiced speech this peak does not appear. Cepstral analysis can therefore serve as a basic tool both for deciding whether a speech segment is voiced or unvoiced and for determining the fundamental period of voiced segments. Estimating the fundamental frequency with the cepstrum is fairly simple. The cepstrum is computed first, and a peak is then searched for in a neighbourhood of the expected period. If the cepstral peak exceeds a preset threshold, the input segment is very likely voiced, and the position of the peak is an estimate of the fundamental period (and hence of the fundamental frequency).
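The cepstral procedure just described (compute the cepstrum, search near the expected period, compare the peak with a threshold) might look like the sketch below. The function names, the search range, the threshold value, and the synthetic pulse-train frame are assumptions for illustration; a naive DFT is used to stay within the standard library.

```python
import cmath
import math

def real_cepstrum(frame, quefrencies, eps=1e-9):
    """Real cepstrum c[q] = IDFT(log|DFT(frame)|), evaluated only at the given q."""
    n = len(frame)
    log_mag = [math.log(abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                                for t in range(n))) + eps)
               for k in range(n)]
    # log_mag is real and even in k, so the inverse DFT reduces to a cosine sum.
    return {q: sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
            for q in quefrencies}

def cepstral_pitch(frame, fs, f_min=120.0, f_max=400.0, threshold=0.5):
    """Return F0 in Hz, or None when the peak is below the (assumed) threshold."""
    q_min, q_max = int(fs / f_max), int(fs / f_min)
    cep = real_cepstrum(frame, range(q_min, q_max + 1))
    q_peak = max(cep, key=cep.get)
    if cep[q_peak] < threshold:
        return None                      # treated as unvoiced
    return fs / q_peak

# Synthetic "voiced" frame: a decaying pulse repeated every 40 samples (200 Hz at 8 kHz).
fs = 8000
frame = [0.9 ** (t % 40) for t in range(400)]
f0 = cepstral_pitch(frame, fs)
```

Because this test frame is exactly periodic, its spectrum has empty bins between harmonics and the peak height depends on `eps`; with real, windowed speech a suitable threshold has to be tuned on data.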
Figure 3.12 illustrates the use of cepstral analysis to distinguish unvoiced from voiced speech and to determine the fundamental frequency of voiced speech. The left side shows a sequence of short-time log spectra (the rapidly varying curves); the right side shows the corresponding cepstra computed from the log spectra on the left. The log spectra and cepstra correspond to consecutive 50 ms segments obtained with a window shifted by 12.5 ms at each step (that is, by about 100 samples at a sampling rate of 8000 samples per second). From the figure, in frames 1 to 5 the window contains only unvoiced speech (no peak appears, and the spectrum varies rapidly and randomly without periodic structure), whereas frames 6 and 7 contain both unvoiced and voiced speech. Frames 8 to 15 contain only voiced speech. The cepstral peak at a quefrency of 11 to 12 ms is clearly visible for the voiced frames. The position of the peak is therefore an accurate estimate of the fundamental period, and hence of the fundamental frequency (about 83 to 91 Hz), within the voiced interval.
Figure 3.12 Logarithm of the harmonic components in the signal spectrum
3.10. Lab exercise: speech analysis
Using a personal computer and Matlab (or another programming language), carry out the following tasks:
For the same spoken content, each member of the group in turn reads or speaks it aloud and records it. Save the recordings as WAV files (*.wav).
Using Matlab (or another programming language) and the material covered in this chapter:
Determine the fundamental frequency.
Determine the frequency of the first formant for each member.
Draw a map of the distribution of the Vietnamese vowels.
Chapter 4: Speech Synthesis
4.1. Introduction
In the past, the term "speech synthesis" usually referred to the artificial production of speech sounds by a machine built on the principle of imitating the human vocal organs. Today, with advances in science and technology, the term has been broadened to cover any process in which a machine delivers information in spoken form, with messages assembled dynamically to suit a particular need. Speech synthesis systems are now widely used, for example to deliver spoken information, in reading machines for blind people, and in assistive devices for people with communication difficulties.
4.2. Speech synthesis methods
4.2.1 Direct synthesis
A simple way to synthesize messages is direct synthesis, in which the parts of a message are assembled by concatenating fragments (units) of recorded human speech. The units are usually stored words or phrases, and the desired spoken message is synthesized by selecting and concatenating the appropriate units. There are many techniques for direct synthesis; they are classified by the size of the units being concatenated and by the type of signal representation used for concatenation. Common methods include word concatenation, concatenation of sub-word units (phonemes and other sub-word units), and concatenation of waveform segments.
a) Simple direct synthesis
The simplest way to create spoken messages is to record and store human speech as separate word units and then play back selected words in the desired order. This method was introduced in the British telephone system in 1936, was commonly used in public announcement systems from the 1960s, and is still found in many telephone management systems around the world. The system must store in memory all the components of the messages that need to be reproduced. The synthesizer merely joins the units required for a message in the specified order, without altering or transforming the individual components.
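A minimal sketch of this "store and play back in order" scheme is shown below. The unit inventory, the function name `synthesize`, and the fixed pause length `gap` are hypothetical; real units would be recorded waveforms loaded from files rather than placeholder lists.

```python
def synthesize(message, units, gap=80):
    """Concatenate stored unit waveforms in message order.

    `units` maps a word to its recorded samples; `gap` zero samples are
    inserted between units as a short pause. The units are not modified.
    """
    out = []
    for i, word in enumerate(message):
        if word not in units:
            raise KeyError(f"no recording stored for unit: {word!r}")
        if i > 0:
            out.extend([0.0] * gap)
        out.extend(units[word])
    return out

# Hypothetical inventory: constant-valued lists stand in for real recordings.
units = {"platform": [0.1] * 400, "two": [0.2] * 300, "five": [0.3] * 350}
wave = synthesize(["platform", "two", "five"], units)
```

The sketch makes the main limitation visible: only messages whose words are all in the inventory can be produced, and nothing is done to smooth pitch or amplitude across unit boundaries.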
The quality of a message synthesized in this way depends on how continuous the acoustic features (spectral envelope, amplitude, fundamental frequency, speaking rate) are across the concatenated units. The method works well when the messages take the form of a list, such as a sequence of digits, or when message blocks always occur at a fixed position in the sentence, because in those cases it is easy to ensure that the output sounds natural in timing and pitch. When a particular sentence structure is required in which words are substituted at certain positions, each word must be recorded in exactly the position it will occupy in the sentence; otherwise it will not fit the sentence intonation. Digits, for example, need to be recorded in two forms: one for sentence-final position and one for non-final position. This is because the pitch pattern of each speech unit changes with the position of the word in the sentence. Compiling the unit inventory is therefore very time-consuming and laborious. In addition, direct concatenation of speech units has great difficulty reproducing the natural coarticulation between words and the intonation and rhythm of the sentence. A further limitation is that applications with a large number of messages require a very large memory.
The large storage requirement can be partly addressed by applying low-bit-rate coding to the speech units before storing them. However, whether large units (words, phrases) are stored directly or in coded form, the number of messages that can be synthesized remains very limited. To increase the number of messages that can be synthesized, word units can be broken down into smaller sub-word units (diphones, demisyllables, syllables, and so on), which are recorded and stored. However, the smaller the speech units, the lower the quality of the synthesized message.
Figure 4.1 compares the spectrogram of a sentence produced by simple direct synthesis with that of the original utterance.
Figure 4.1 Comparison of a directly synthesized message with the original message
b) Direct synthesis from waveform segments
As noted above, simple direct synthesis is limited in its ability to reproduce the rate and naturalness (stress, rhythm, intonation) of the synthesized message. This problem can be addressed by synthesizing from waveform segments, a method also known as pitch-synchronous overlap-add. Consider the problem of joining two waveform segments of a vowel. The discontinuity in the synthesized waveform is minimized if the join occurs at the same position within a glottal cycle in both segments. This position is usually the region of lowest signal amplitude, where the vocal-tract response to the current glottal pulse has largely decayed, just before the next pulse. In other words, the two segments are joined in a pitch-synchronous manner. The most common method of doing this is TD-PSOLA (Time-Domain Pitch-Synchronous Overlap-Add).
TD-PSOLA marks the positions in the speech waveform that correspond to closure of the vocal folds (that is, the pitch pulses). These pitch marks are used to cut a windowed segment of the waveform for each period. For each period, the window must be centred on the region of maximum signal amplitude, and the window shape must be chosen appropriately. In addition, the window must be longer than one period so that adjacent windowed segments overlap slightly.
Figure 4.2 illustrates the operating principle of TD-PSOLA with a Hanning window.
Figure 4.2 Principle of the TD-PSOLA method
The illustration shows that by overlap-adding the windowed waveform segments at the relative positions given by the analysis pitch marks, we can reconstruct the desired message quite accurately. Furthermore, by changing the relative positions and the number of pitch marks, we can change the pitch and the duration of the synthesized message.
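The overlap-add step can be sketched in a heavily simplified form, as below. This is not a full TD-PSOLA implementation: it assumes a constant pitch period, pitch marks that are already known, one synthesis mark per analysis mark (so duration is not preserved), and no amplitude normalisation. The function name `psola` and all parameter names are illustrative.

```python
import math

def psola(x, marks, period, new_spacing):
    """Overlap-add two-period Hann-windowed segments at re-spaced pitch marks.

    x           : input samples
    marks       : analysis pitch marks (sample indices), assumed `period` apart
    period      : pitch period T in samples (held constant in this sketch)
    new_spacing : distance between synthesis marks; a value below T raises the pitch
    """
    length = 2 * period
    window = [0.5 * (1.0 - math.cos(2.0 * math.pi * j / length)) for j in range(length)]
    new_marks = [marks[0] + i * new_spacing for i in range(len(marks))]
    out = [0.0] * (new_marks[-1] + period + 1)
    for m, nm in zip(marks, new_marks):
        for j in range(length):
            src, dst = m - period + j, nm - period + j
            if 0 <= src < len(x) and 0 <= dst < len(out):
                out[dst] += window[j] * x[src]
    return out

T = 40
x = [math.sin(2 * math.pi * n / T) for n in range(400)]
marks = list(range(40, 361, 40))                # one mark per period
same = psola(x, marks, T, new_spacing=40)       # unchanged spacing reproduces x
higher = psola(x, marks, T, new_spacing=32)     # output repeats every 32 samples
```

With unchanged spacing, the two-period Hann windows overlap by half and sum to one, so the interior of the signal is reconstructed exactly. With a spacing of 32 samples the output becomes periodic with the new, shorter period, which is the pitch modification described in the text.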
4.2.2 Formant synthesis
Formant synthesis was the first genuine synthesis method to be developed and remained the dominant approach until the early 1980s. It is also known as synthesis by rule. It takes a modular, model-based, acoustic-phonetic approach to the synthesis problem. In this method the acoustic-tube model is used in a particular way, such that the control parameters of the tube are easily related to acoustic-phonetic properties and can be readily observed.
Figure 4.3 shows the general block diagram of a formant synthesis system. Its operating principle is as follows. Sound is produced by a source. For vowels and voiced consonants, the source can be generated either entirely by a periodic function in the time domain or by an impulse train passed through a linear filter that models the glottis (a glottal LTI filter). For unvoiced sounds, the source is a random noise generator. For obstruents, the source is obtained by combining the voiced source and the unvoiced source. The source signal is fed into the vocal-tract model. To reproduce all the formants, separate models of the oral cavity and the nasal cavity are built in parallel. The signal therefore passes through the oral-cavity model and, if nasal sounds are required, also through the nasal-cavity model. Finally, the outputs of the oral and nasal models are combined and passed through a radiation system, which models the propagation and load characteristics of the lips and nose.
Figure 4.3 Block diagram of formant synthesis
According to filter theory, a formant can be created with a second-order IIR filter whose transfer function is:
$$H(z) = \frac{1}{1 - a_1 z^{-1} - a_2 z^{-2}} \tag{4.1}$$
This transfer function can be factored as:
$$H(z) = \frac{1}{(1 - p_1 z^{-1})(1 - p_2 z^{-1})} \tag{4.2}$$
For a filter with real coefficients $a_1$ and $a_2$, the poles must form a complex-conjugate pair. Note that a second-order filter of this kind has a spectrum with two resonances, but only one of them lies in the positive-frequency half. We can therefore regard the filter as producing a single useful formant. The poles can be examined on a pole-zero plot, where the magnitude of the poles determines the bandwidth and amplitude of the resonance. The smaller the pole magnitude, the flatter the resonance; conversely, the larger the magnitude, the sharper the resonance.
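If the poles are expressed in polar coordinates with angle $\theta$ and radius $r$, and we use the fact that they form a complex-conjugate pair, then $a_1 = 2r\cos\theta$ and $a_2 = -r^2$, and the transfer function (4.1) can be written as:
$$H(z) = \frac{1}{1 - 2r\cos(\theta)\, z^{-1} + r^2 z^{-2}} \tag{4.3}$$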
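A second-order resonator of the form (4.3) can be sketched directly from its difference equation $y(n) = x(n) + 2r\cos\theta\, y(n-1) - r^2 y(n-2)$. The function names, the sampling rate, and the values $r = 0.95$ and a 500 Hz pole angle are assumptions chosen for illustration.

```python
import cmath
import math

def resonator(x, r, theta):
    """Filter x through H(z) = 1 / (1 - 2 r cos(theta) z^-1 + r^2 z^-2)."""
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = xn + a1 * y1 + a2 * y2
        y.append(yn)
        y1, y2 = yn, y1          # shift the two-sample output history
    return y

def gain(r, theta, omega):
    """Magnitude response |H(e^{j omega})| of the resonator."""
    z1 = cmath.exp(-1j * omega)
    return 1.0 / abs(1.0 - 2.0 * r * math.cos(theta) * z1 + r * r * z1 * z1)

fs = 8000.0
r, theta = 0.95, 2.0 * math.pi * 500.0 / fs    # pole angle placed at 500 Hz
impulse_response = resonator([1.0] + [0.0] * 63, r, theta)
# Frequency (on a 1 Hz grid) where the magnitude response is largest.
peak_hz = max(range(1, 4000), key=lambda f: gain(r, theta, 2.0 * math.pi * f / fs))
```

The resonance peak lies close to, but not exactly at, the pole angle; the offset shrinks as $r$ approaches 1, which is one reason why bandwidth and peak position cannot be controlled completely independently.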
From this we see that a formant can be placed at any desired frequency simply by choosing the appropriate value of $\theta$. Controlling the bandwidth directly is more difficult. The position of the formant changes the shape of the spectrum, so an exact relationship valid in every case cannot be obtained. Note also that if two poles are close together, they merge into a single resonance peak, which further complicates the bandwidth calculation. Empirically, the relationship between the normalized formant bandwidth and the pole radius is reasonably approximated by:
$$\hat{B} = -2\ln(r) \tag{4.4}$$
With $\theta = 2\pi\hat{F}$ and $r = e^{-\hat{B}/2}$ from (4.4), the transfer function can be expressed in terms of the normalized frequency $\hat{F}$ and normalized bandwidth $\hat{B}$ of the formant:
$$H(z) = \frac{1}{1 - 2e^{-\hat{B}/2}\cos(2\pi\hat{F})\, z^{-1} + e^{-\hat{B}} z^{-2}} \tag{4.5}$$
Here the normalized frequency $\hat{F}$ and normalized bandwidth $\hat{B}$ are obtained by dividing $F$ and $B$, respectively, by the sampling frequency $F_s$.
Several formants can be produced with a filter whose transfer function is the product of several second-order transfer functions. In other words, the transfer function of the vocal tract has the form:
$$H(z) = H_1(z)\,H_2(z)\,H_3(z)\,H_4(z) \tag{4.6}$$
where $H_i(z)$ is a function of the frequency $F_i$ and bandwidth $B_i$ of the $i$-th formant.
The corresponding input-output relation in the time domain is:
$$y(n) = x(n) + a_1 y(n-1) + a_2 y(n-2) + \dots + a_8 y(n-8) \tag{4.7}$$
where $a_1, \dots, a_8$ are the coefficients obtained by multiplying out the four second-order denominators. A model of the nasal cavity can be built in the same way. Equations (4.6) and (4.7) describe formant synthesis in the serial, or cascade, configuration.
Another technique is parallel formant synthesis, which models each formant separately. In other words, each model has its own transfer function $H_i(z)$. During speech generation, the source signals are fed into the individual models separately, and the model outputs $y_i(n)$ are then summed:
$$y(n) = y_1(n) + y_2(n) + \dots \tag{4.8}$$
Figure 4.4 shows the general configuration of cascade and parallel synthesis.
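The cascade form (4.6), its equivalent eighth-order difference equation (4.7), and the parallel form (4.8) can be compared in a short sketch. The four formant frequencies, the common pole radius of 0.95, and the 10 kHz sampling rate are illustrative assumptions, as are all function names.

```python
import math

def resonator(x, r, theta):
    """Second-order section 1 / (1 - 2 r cos(theta) z^-1 + r^2 z^-2)."""
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    y, y1, y2 = [], 0.0, 0.0
    for xn in x:
        yn = xn + a1 * y1 + a2 * y2
        y.append(yn)
        y1, y2 = yn, y1
    return y

def cascade(x, sections):
    """Eq. (4.6): feed the output of each resonator into the next."""
    for r, theta in sections:
        x = resonator(x, r, theta)
    return x

def parallel(x, sections):
    """Eq. (4.8): feed the same input to every resonator and sum the outputs."""
    outs = [resonator(x, r, theta) for r, theta in sections]
    return [sum(o[n] for o in outs) for n in range(len(x))]

def expand_denominator(sections):
    """Multiply the second-order denominators; returns [1, d_1, ..., d_2N]."""
    d = [1.0]
    for r, theta in sections:
        sec = [1.0, -2.0 * r * math.cos(theta), r * r]
        new = [0.0] * (len(d) + 2)
        for i, di in enumerate(d):
            for j, sj in enumerate(sec):
                new[i + j] += di * sj
        d = new
    return d

def direct_form(x, d):
    """Eq. (4.7): y(n) = x(n) + sum_i a_i y(n-i), with a_i = -d[i]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for i in range(1, min(n, len(d) - 1) + 1):
            acc -= d[i] * y[n - i]
        y.append(acc)
    return y

fs = 10000.0
formants_hz = [500.0, 1500.0, 2500.0, 3500.0]      # illustrative values only
sections = [(0.95, 2.0 * math.pi * f / fs) for f in formants_hz]
impulse = [1.0] + [0.0] * 199
y_cascade = cascade(impulse, sections)
y_direct = direct_form(impulse, expand_denominator(sections))
y_parallel = parallel(impulse, sections)
```

The cascade output and the direct eighth-order recursion agree up to rounding error, which is what (4.6) and (4.7) state. The parallel output is a different signal, because summing resonators also introduces zeros between the formants.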
Figure 4.4 Configurations for multi-formant synthesis
The advantage of cascade synthesis is that, for a given set of formant values, the transfer function and the input-output relation (the difference equation) are easy to construct. Synthesizing the formants separately in the parallel configuration allows the frequency of each formant to be specified precisely.
Although formant synthesis is simple and usually produces clear, intelligible sound, it has difficulty achieving the naturalness of real speech. This is because the source model and the transfer model are oversimplified and ignore many secondary factors that contribute to the dynamic character of the signal.
The Klatt synthesizer
The Klatt synthesizer is one of the most sophisticated formant-based speech synthesizers developed. Its block diagram is shown in Figure 4.5; it uses both parallel and cascade resonator systems.
In the diagram, the blocks $R_i$ are the resonators for the $i$-th formant, and the boxes $A_i$ control the amplitude of the corresponding signal. The resonators are designed to operate at a sampling rate of 10 kHz, with six main formants.
Note that in practice formant synthesizers usually run at a sampling rate of about 8 kHz or 10 kHz. This is not for any principled reason related to synthesis quality, but because limits on storage, processing speed and output requirements did not allow higher sampling rates. Note also that studies have shown that the first three formants are sufficient to distinguish speech sounds; when six formants are used, the higher formants simply add naturalness to the synthesized signal.
Figure 4.5 Block diagram of the Klatt synthesizer
4.2.3 Articulatory synthesis
An obvious way to synthesize speech is to imitate the human vocal apparatus. This is the principle behind the classical "talking machines", the most famous of which was built by Von Kempelen. Classical synthesizers of this kind were usually mechanical devices made of tubes, bellows and similar parts, operated much like musical instruments; with a little training they could produce recognizable speech. The machine was controlled by a human operator in real time, which had the advantage that the operator could use mechanisms such as auditory feedback to control the machine and imitate natural speech production. Today, however, with the demand for more sophisticated synthesizers, the classical machines are clearly obsolete.
As our understanding of the vocal apparatus has improved, synthesizers based on modelling it have become increasingly sophisticated and complete. Complex tube shapes are approximated by a series of smaller, simple tubes. Because the acoustic transmission properties of simple tubes are known, they can be used to build complex general models of the vocal apparatus.
One advantage of articulatory synthesis is that it offers a more natural way of producing speech. The method also faces several difficulties, however. The first is deciding how to obtain the control parameters from the specification of the signal to be synthesized. This difficulty arises in other synthesis methods as well, but in most of them the parameters, for example the formant parameters, can be found directly from real speech: we simply record the speech and compute them. In articulatory synthesis this is harder, because the parameters of the vocal apparatus cannot be determined from a recording of the signal and must instead be obtained through measurements such as X-ray or MRI. The second difficulty is striking a balance between a model that imitates the human vocal apparatus as accurately as possible and a practical model that is easy to design and implement. Both difficulties remain open challenges for researchers, which is why very few articulatory synthesis systems so far match the quality of synthesizers based on other principles.
4.3. Text-to-speech synthesis systems
Converting text to speech (TTS) is an ambitious goal and remains a focus of research and development. TTS is used in many everyday applications, such as voice access to email and database applications for services supporting blind people. The block diagram of a typical TTS system is shown in Figure 4.6.
Figure 4.6 Block diagram of a TTS system
The illustration shows that a TTS system can be characterized as a two-stage analysis-synthesis process. The first stage analyses the text to determine its underlying linguistic structure. Input text usually contains abbreviations, Roman numerals, dates, formulas, punctuation marks and so on. The text-analysis stage must be able to convert the input text into an acceptable standard form for use by the next stage. The abstract linguistic description obtained at this stage may include a phoneme sequence together with other information, such as stress structure and syntactic structure. This description is converted into a phonetic transcription with the help of a pronunciation dictionary and accompanying pronunciation rules. The second stage synthesizes the speech waveform from the parameters obtained in the first stage.
Both the analysis and the synthesis stage of a TTS system involve a series of processing operations. Most modern TTS systems carry out these operations in a modular architecture such as the one shown in Figure 4.7.
The operation of the block diagram can be outlined as follows. When text is fed in, each module takes the input text or information about the text produced by other modules and generates the output information required by the subsequent modules. This continues until the final synthesized waveform is produced. Information is processed and passed from module to module through a separate processing engine. The engine controls the sequence of operations to be executed and stores all information in suitable data structures.
Figure 4.7 Block diagram of the modular architecture of a modern TTS system
a) Text analysis
Text consists of letters and digits, spaces, and possibly a range of other special characters. The first step in text analysis is therefore to pre-process the input text (including replacing digits and abbreviations with their full written forms) so as to convert it into a sequence of words. Pre-processing usually also detects and marks sentence breaks and other relevant formatting information, such as paragraph breaks. The subsequent text-processing modules convert the word sequence into a linguistic description. One of their most important functions is to determine the pronunciation of the individual words. In languages such as English, the relationship between the spelling of a word and its phonemic transcription is extremely complex. Moreover, the relationship can differ between words with the same spelling pattern, as in the pronunciation of "ough" in the words "through", "though", "bough", "rough" and "cough".
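The pre-processing step (expanding digits and abbreviations into a word sequence) might be sketched as follows. The abbreviation table, the digit-by-digit reading of numbers, and the function name `normalize` are hypothetical simplifications; a real front end would also mark sentence and paragraph breaks and read numbers according to context.

```python
import re

# Hypothetical lookup tables for illustration only.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "no.": "number"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

def normalize(text):
    """Convert raw text into a list of plain lower-case words."""
    words = []
    for token in text.lower().split():
        bare = token.strip(".,;:!?")
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])          # expand abbreviation
        elif bare.isdigit():
            words.extend(DIGITS[int(d)] for d in bare)  # read digits one by one
        else:
            cleaned = re.sub(r"[^a-z']", "", token)     # drop punctuation
            if cleaned:
                words.append(cleaned)
    return words

words = normalize("Dr. Smith lives at No. 42 Main St.")
```

For the sample sentence this yields the nine words "doctor smith lives at number four two main street". The sketch ignores ambiguity (for instance "St." as "saint"), which is exactly the kind of case that requires the later linguistic analysis.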
As outlined above, the pronunciation of a word is usually determined by combining a pronunciation dictionary with accompanying pronunciation rules. In earlier TTS systems the emphasis was on determining pronunciation by rule, with a small exceptions dictionary for common words with irregular pronunciation (such as "one", "two", "said"). Today, with computer memory readily available at low cost, pronunciation is usually determined with a very large pronunciation dictionary (possibly containing several tens of thousands of words), which ensures that known words are pronounced correctly. Pronunciation rules are nevertheless still needed to handle unknown words, because new vocabulary is continually added to the language and because it is impossible to include every proper noun in the dictionary.
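The "dictionary first, rules as a fallback" strategy can be illustrated with the toy sketch below. The lexicon entries, the ARPAbet-style phoneme symbols, and the one-letter-one-sound rule table are all hypothetical and far cruder than real letter-to-sound rules.

```python
# Hypothetical miniature lexicon for words with irregular pronunciation.
LEXICON = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
    "said": ["S", "EH", "D"],
}

# Crude fallback rules: one assumed sound per letter (illustration only).
LETTER_TO_SOUND = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AA", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "K S", "y": "Y", "z": "Z",
}

def pronounce(word):
    """Return (phoneme list, source), looking in the lexicon before applying rules."""
    word = word.lower()
    if word in LEXICON:
        return LEXICON[word], "lexicon"
    phones = []
    for ch in word:
        phones.extend(LETTER_TO_SOUND.get(ch, "").split())
    return phones, "rules"

known = pronounce("two")     # found in the lexicon
unknown = pronounce("bat")   # falls back to the letter rules
```

Applying the letter rules to "two" would give a wrong result, which shows why irregular words belong in the dictionary while the rules remain as a safety net for words the dictionary does not contain.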
Determining the pronunciation of a word is easier if its structure, or morphology, is known in advance. Most TTS systems include a morphological analysis. This analysis identifies the root form of each word (for example, the root form of "gives" is "give") and removes the need to add every form derived from a root to the dictionary. Some syntactic analysis of the text may also be needed to determine the correct pronunciation of certain words. In English, for example, the word "live" is pronounced differently depending on whether it is a verb or an adjective. The pronunciations determined so far are those of the words spoken in isolation. Some adjustments are therefore needed to incorporate the phonetic effects that occur across word boundaries, in order to improve the naturalness of the synthesized speech.
Besides determining the pronunciation of the word sequence, the text-analysis stage must also determine information about how the text is to be spoken. This information includes phrasing, lexical (word-level) stress, and the intonation patterns of the different words; it is used to generate the prosody of the synthesized speech. Lexical stress marks can be added to each word in the dictionary, but rules are still needed to assign lexical stress to any word not found in the dictionary. Some words, such as "permit", are stressed on different syllables depending on whether they are used as a noun or a verb, so grammatical information is also needed to assign the stress pattern correctly. The result of a syntactic analysis can also be used to group words into prosodic phrases and then to decide which words are to be accented, so that an accent pattern can be assigned to the word sequence. Although syntactic structure provides useful cues for accent and phrasing (and hence for generating prosody), in many cases truly expressive prosody cannot be achieved without actually understanding the meaning of the text. Although some semantic cues are used, full semantic and pragmatic analysis is beyond the capabilities of current TTS systems.
b) Speech synthesis
The information extracted by text analysis is used to generate the prosody of the speech units, including their temporal structure, overall emphasis level, and fundamental frequency. The final module of the TTS system generates the speech sound, first by selecting the appropriate synthesis units and then by synthesizing these units together according to the known prosodic information. The synthesis can be carried out with any of the methods described in the preceding sections.
4.4. Lab exercise: speech synthesis
Using the simple direct synthesis method:
- Using a personal computer and Matlab (or another programming language), build a public bus-stop announcement system.
- Using a personal computer and Matlab (or another programming language), build a system that announces the queue number of the next customer to be served at a bank branch.
Chapter 5: Speech Recognition
5.1. Introduction
Machines that can recognize and understand speech spoken by anyone, in any environment, have been a long-standing aspiration of people in general and of researchers and speech recognition projects in particular for nearly a century. Although we have made great strides in understanding how speech is produced and have developed many speech analysis techniques, and although considerable progress has been made in building and developing important speech recognition systems, we are still far from the goal of building machines that can communicate naturally with humans. In this chapter we first review the history of speech recognition research, then examine a general speech recognition system and some of the methods currently used in such systems, together with their advantages and disadvantages.
5.2. History of speech recognition systems
Speech recognition research has been under way for nearly a century. Over this period, recognition technology can be divided into the following generations:
Generation 1: from the 1930s to the 1950s. The technology of this generation consisted of ad hoc methods for recognizing individual sounds or small vocabularies of isolated words.
Generation 2: from the 1950s to the 1960s. The technology of this generation used acoustic-phonetic methods to recognize phonemes, syllables, or digit vocabularies.
Generation 3: from the 1960s to the 1980s. This generation used pattern recognition to recognize speech with small to medium vocabularies of isolated words or connected word sequences. It included the use of LPC as the basic analysis method; LPC distance measures for scoring the similarity of patterns; dynamic programming for time alignment; pattern recognition for clustering patterns into consistent reference patterns; and vector quantization coding to reduce data and computation.
Generation 4: from the 1980s to the 2000s. The technology of this generation used statistical methods, with hidden Markov models (HMMs) modelling the dynamic and statistical properties of the speech signal in a continuous recognition system; forward-backward and segmental K-means training; Viterbi time alignment; maximum likelihood (ML) and various other performance criteria and methods for optimizing the statistical models; neural networks to estimate conditional probability densities; and adaptation algorithms that modify the parameters associated with either the speech signal or the statistical model, to improve the match between model and data and thereby increase recognition accuracy.
Generation 5: the generation now emerging. Its technology uses parallel processing to increase the reliability of recognition decisions; combines HMMs with acoustic-phonetic methods to detect and correct linguistic exceptions; increases the robustness of recognition systems in noisy environments; and uses machine learning to build optimal combinations of models.
Note that this division into periods is only approximate. The generations of technology are not sharply separated; the core ideas of each period were almost always conceived in the preceding one. The periods merely indicate when many research results related to the technology of a generation were published and became the standard for most recognition systems of that time.
5.3. Classification of speech recognition systems
Speech recognition systems can be classified in different ways, depending on the point of view. In terms of the speech unit used, they fall into two main types. The first is isolated-word recognition, in which single, separately spoken words are recognized. The second is continuous speech recognition, in which continuous sentences are recognized. Continuous speech recognition systems can be further divided into those intended for transcription and those intended for understanding. Transcription systems aim to recognize every word correctly. Understanding systems, also called conversational speech recognition systems, concentrate on understanding the meaning of sentences rather than on recognizing individual words. Continuous speech recognition systems must make use of sophisticated linguistic knowledge; applying grammatical rules, which govern how sequences of words are organized in a sentence, is one example.
From another point of view, speech recognition systems can be divided into speaker-independent and speaker-dependent systems. A speaker-independent system can recognize the speech of anyone. In a speaker-dependent system, by contrast, the reference patterns or models must be updated each time the speaker changes. Although speaker-independent recognition is much harder than speaker-dependent recognition, developing speaker-independent methods is especially important for broadening the range of use of recognition systems.
Speech systems can also be divided into the following groups: automatic speech recognition systems, continuous speech recognition systems, and natural language processing (NLP) systems. Automatic speech recognition systems, as the name suggests, recognize speech without requiring additional input from the user. Continuous speech recognition systems, as mentioned above, can recognize continuous sentences; in other words, in theory such systems do not require the user (the speaker) to pause while speaking. Natural language processing systems are used in more than just speech recognition. They employ the computational methods a machine needs in order to understand the meaning of what is being said, rather than merely knowing which words were spoken.
More generally, Victor Zue and colleagues defined a set of parameters and used them to classify recognition systems, as shown in Table 5.1.
Parameter        Typical range
Speech unit      Isolated (single words) to continuous (continuous sentences)
Training         Trained before use to continuously trained
Speaker          Dependent to independent
Vocabulary       Small to large
SNR              Low to high
Transducer       Restricted to unrestricted
Table 5.1: Parameters and the corresponding classification of recognition systems
5.4. Structure of a speech recognition system
Figure 5.1 shows the basic structure of a speech recognition system. The speech signal is first processed with one of the short-time spectral analysis methods, a step known as feature extraction or front-end processing. The result of feature extraction is a set of acoustic features arranged as a vector. Typically, about 100 acoustic feature vectors are produced per second at the output of the analysis.
Figure 5.1 General structure of a speech recognition system
Matching is preceded by training, in which the reference patterns are built; these are then compared with the input parameters to perform recognition. During training, sequences of feature vectors are fed into the system to estimate the parameters of the reference patterns. A reference pattern can model a word, a single phoneme, or some other speech unit. Depending on the task of the recognition system, training may be more or less complex. For speaker-dependent recognition, for example, training may involve only a few utterances, or even a single utterance, of each word. For speaker-independent recognition, however, it may involve thousands of utterances for each desired reference pattern. These utterances are usually part of a previously collected speech database. Note that extracting representative features and building a reference model is a time-consuming and complex task.
During recognition, the sequence of feature vectors is compared with the reference patterns. The system computes the likelihood of the feature-vector sequence given each reference pattern or sequence of reference patterns. The likelihood is usually computed with an efficient algorithm such as the Viterbi algorithm. The pattern or pattern sequence with the highest likelihood is taken as the recognition result.
Today, the most common feature extraction methods use a Mel filterbank combined with a transformation of the Mel spectrum into the cepstral domain. We will examine the front-end processing scheme standardized by ETSI. The reference models are usually hidden Markov models (HMMs).
5.5. Analysis methods for speech recognition
5.5.1 Vector quantization
The result of feature extraction is a sequence of vectors characterizing the time-varying spectral properties of the speech signal. For convenience we denote the spectral vectors by $v_l$, $l = 1, 2, \dots, L$, where each vector typically has dimension $p$. Comparing the information rate of the vector representation with that of the raw (uncoded) speech waveform shows that spectral analysis greatly reduces the required information rate. For example, consider speech sampled at 10 kHz with 16 bits per sample. The raw representation requires 160000 bps to store the samples. For spectral analysis, suppose we use vectors of dimension $p = 10$ and 100 spectral vectors per second. If each spectral component is again represented with 16 bits, we need 100 x 10 x 16 bps, or 16000 bps. Spectral analysis thus reduces the rate by a factor of 10, a reduction that is extremely important for storage.
Building on the idea that, ideally, only a single spectral representation is needed for each speech unit, we can further reduce the raw spectral representation to elements drawn from a small, finite set of unique spectral vectors, each corresponding to one basic speech unit (that is, a phoneme). Such an ideal representation is of course hard to achieve in practice, because the spectral properties of each basic speech unit vary too widely. Nevertheless, the concept of building a codebook of distinct analysis vectors, albeit with considerably more code words than the basic set of phonemes, remains attractive and is the basic idea behind a family of analysis techniques known collectively as vector quantization. Following this reasoning, suppose we need a codebook of about 1024 unique spectral vectors (about 25 variants for each of 40 basic speech units). Then, to represent an arbitrary spectral vector, all we need is a 10-bit number: the index of the codebook vector that best matches the input vector. Assuming 100 spectral vectors per second, a total bit rate of about 1000 bps is needed to represent the spectral vectors of the signal. This is only about 1/16 of the rate required for the continuous spectral vectors. Vector quantization is therefore a highly efficient representation of the spectral information in the speech signal.
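The bit-rate figures in this example can be reproduced with a few lines of arithmetic. The variable names are illustrative; the numerical values are those given in the text.

```python
# Raw waveform: 10 kHz sampling, 16 bits per sample.
fs, bits_per_value = 10_000, 16
raw_bps = fs * bits_per_value                       # 160000 bps

# Spectral vectors: 100 vectors per second, dimension p = 10, 16 bits per component.
frames_per_second, p = 100, 10
spectral_bps = frames_per_second * p * bits_per_value   # 16000 bps

# Vector quantization: one codebook index per frame, codebook of M = 1024 vectors.
M = 1024
index_bits = (M - 1).bit_length()                   # 10 bits address 1024 entries
vq_bps = frames_per_second * index_bits             # 1000 bps

reduction_spectral = raw_bps // spectral_bps        # factor of 10
reduction_vq = spectral_bps // vq_bps               # factor of 16
```

Relative to the raw waveform, the quantized representation is 160 times more compact, at the cost of the quantization error discussed in the text.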
Before discussing the design and implementation of a practical vector quantizer, we review the advantages and disadvantages of the method. The main advantages of the vector quantization representation are:
Reduced storage for the spectral analysis information. This makes the method convenient to apply in practical speech recognition systems.
Reduced computation for determining the similarity of spectral analysis vectors. In speech recognition, a major part of the computation is determining the spectral similarity of a pair of vectors. With a vector quantization representation, this computation is often reduced to a table lookup of the similarity between pairs of codebook vectors.
A discrete representation of speech sounds. By associating a phonetic label (or possibly a set of phonetic labels, or a phonetic class) with each codebook vector, the process of choosing the codebook vector that best represents a given spectral vector becomes the assignment of a phonetic label to each spectral frame of the signal. A number of existing speech recognition systems use these labels to recognize speech efficiently.
Using a vector quantization codebook to represent the spectral vectors of speech also has some disadvantages:
There is an inherent spectral distortion in representing the actual analysis vector. Since there is only a finite number of codebook vectors, choosing the best codebook vector to represent a given spectral vector amounts to quantizing the vector and therefore introduces a quantization error. The quantization error decreases as the number of codebook vectors increases, but for any finite codebook there is always some level of quantization error.
The storage required for the codebook vectors is often nontrivial. The larger the codebook (that is, the smaller the quantization error), the more storage the codebook vectors require. For codebooks of size 1000 or more, the storage is often nontrivial. There is thus a trade-off between quantization error, the cost of selecting the codebook vector, and the storage for the codebook vectors. Practical designs must balance these three factors.
a) Implementation of vector quantization
The block diagram of the basic vector quantization training and classification structure is shown in Figure 5.2. A large set of spectral analysis vectors $v_1, v_2, \ldots, v_L$ forms the training set. This set is used to create an optimal set of codebook vectors representing the spectral variability observed in the training set. If we denote the size of the vector quantization codebook as $M = 2^B$ (we call this a B-bit codebook), then we need $L \gg M$ in order to find a set of $M$ best-matching vectors. In practice, $L$ should be at least $10M$ for codebook training to work well. Next comes a measure of similarity, or distance, between pairs of spectral analysis vectors, which is needed both to cluster the training vectors and to assign or classify spectral vectors into unique codebook entries. The spectral distance between two spectral vectors $v_i$ and $v_j$ is denoted $d_{ij} = d(v_i, v_j)$. The procedure then classifies the $L$ training vectors into $M$ clusters, and the $M$ codebook vectors are chosen as the centroids of these clusters. The classification procedure for an arbitrary speech spectral analysis vector chooses the codebook vector closest to the input vector and uses the codebook index as the resulting spectral representation. This is often called the nearest-neighbor search or the optimal encoding procedure. The classification procedure is essentially a quantizer whose input is a speech spectral vector and whose output is the index of the codebook vector that best matches the input.
Figure 5.2 Model using vector quantization for training and classification
b) The vector quantizer training set
To train the vector quantization codebook properly, the training vectors should span the anticipated range of the following:
- Talkers, including ranges in age, accent, gender, speaking rate, levels, and other variables.
- Environmental conditions, such as a quiet room, an automobile, or a noisy workstation.
- Transducers and transmission systems, including wideband microphones, telephone handsets (with carbon and electret microphones), direct transmission, telephone channels, wideband channels, and other devices.
- Speech units, including specific recognition vocabularies (for example digits) and conversational speech.
The narrower and more specific the training target (for example a limited number of talkers, speech in a quiet room, ...), the smaller the quantization error when representing the signal spectrum with a codebook of fixed size. However, to be applicable to a wide range of practical problems, the training set should be as large as possible.
c) Measuring similarity or distance
The spectral distance between spectral vectors $v_i$ and $v_j$ is defined as:
$$ d(v_i, v_j) = d_{ij} \begin{cases} = 0 & \text{if } v_i = v_j \\ > 0 & \text{if } v_i \ne v_j \end{cases} \tag{5.1} $$
d) Clustering the training vectors
The procedure for clustering the set of $L$ training vectors into a codebook of $M$ codebook vectors can be described as follows:
1) Initialization: choose $M$ arbitrary vectors from the $L$ training vectors as the initial set of codewords in the codebook.
2) Nearest-neighbor search: for each training vector, find the codeword in the current codebook that is closest (in the sense of spectral distance) and assign the vector to the corresponding cell.
3) Centroid update: update the codeword in each cell using the centroid of the training vectors assigned to that cell.
4) Iteration: repeat steps 2 and 3 until the average distance falls below a preset threshold.
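The four steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: squared Euclidean distance stands in for the spectral distance $d$, the names `sq_dist`, `nearest`, and `train_codebook` are hypothetical, and an extra stopping rule (stop when the average distance no longer decreases, or after `max_iter` passes) is added so the loop always terminates.

```python
import random

def sq_dist(u, v):
    # Assumed distance measure: squared Euclidean distance.
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(v, codebook):
    # Index of the codeword closest to v (full search).
    return min(range(len(codebook)), key=lambda m: sq_dist(v, codebook[m]))

def train_codebook(train, M, threshold=1e-9, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick M arbitrary training vectors as the initial codewords.
    codebook = [list(v) for v in rng.sample(train, M)]
    prev_avg = float("inf")
    for _ in range(max_iter):
        # Step 2: assign every training vector to its nearest codeword.
        cells = [[] for _ in range(M)]
        total = 0.0
        for v in train:
            m = nearest(v, codebook)
            cells[m].append(v)
            total += sq_dist(v, codebook[m])
        avg = total / len(train)
        # Step 3: replace each codeword by the centroid of its cell.
        for m, cell in enumerate(cells):
            if cell:  # an empty cell keeps its old codeword
                codebook[m] = [sum(col) / len(cell) for col in zip(*cell)]
        # Step 4: stop below the threshold (or when no further improvement).
        if avg < threshold or prev_avg - avg <= 0:
            break
        prev_avg = avg
    return codebook

train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
         [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
print(train_codebook(train, 2))
```

On this toy set the two codewords settle on the centroids of the two groups of points; with real spectral vectors a spectral distance measure would replace `sq_dist`.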
e) Vector classification procedure
Classifying an arbitrary spectral vector is basically a full search through the codebook to find the best match. Denote the codebook vectors of an $M$-vector codebook as $y_m$ ($1 \le m \le M$) and the spectral vector to be classified (and quantized) as $v$. The index $m^*$ of the best-matching codeword is then:
$$ m^* = \arg\min_{1 \le m \le M} d(v, y_m) \tag{5.2} $$
For codebooks with large $M$ (for example $M \ge 1024$), the computation in (5.2) can become excessive, depending on the computational details of the spectral distance measure. In practice, suboptimal search algorithms are often used.
5.5.2 The LPC processor in speech recognition
The previous section discussed the general properties of the LPC analysis method. This section describes in detail how an LPC processor is used in speech recognition systems. The block diagram of the LPC processor is shown in Figure 5.3. The basic steps in the processing are as follows:
Figure 5.3 Block diagram of the LPC processor in speech recognition
a) Preemphasis
The digitized speech signal $s(n)$ is first passed through a low-order digital filter, typically a first-order finite impulse response (FIR) filter, to flatten the signal spectrum. This makes the signal less susceptible to finite-precision effects in later processing. The preemphasis filter can be either a filter with fixed parameters or a slowly varying adaptive filter. In speech processing, the most common choice is a fixed first-order system of the form:
$$ H(z) = 1 - a z^{-1}, \quad 0.9 \le a \le 1.0 \tag{5.3} $$
The output of the preemphasis network, $\tilde{s}(n)$, is then:
$$ \tilde{s}(n) = s(n) - a\,s(n-1) \tag{5.4} $$
The most common value of the fixed coefficient $a$ is about 0.95 (in fixed-point implementations, $a$ is often chosen as 15/16 = 0.9375). Figure 5.4 shows the magnitude response $|H(e^{j\omega})|$ for $a = 0.95$. At $\omega = \pi$, that is, at half the sampling rate, the magnitude is boosted by about 32 dB relative to the magnitude at $\omega = 0$.
Figure 5.4 Magnitude spectrum of the preemphasis network
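A short Python sketch of the fixed preemphasis filter of (5.3) and (5.4), together with a numerical check of the 32 dB figure. The function names are hypothetical, and taking $s(-1) = 0$ for the first sample is an assumption not stated in the text.

```python
import cmath
import math

def preemphasize(s, a=0.95):
    # s~(n) = s(n) - a*s(n-1); s(-1) is assumed to be 0.
    return [s[n] - a * (s[n - 1] if n > 0 else 0.0) for n in range(len(s))]

def magnitude_db(omega, a=0.95):
    # 20*log10 |H(e^{j*omega})| for H(z) = 1 - a*z^{-1}
    h = 1 - a * cmath.exp(-1j * omega)
    return 20 * math.log10(abs(h))

# Boost at half the sampling rate relative to omega = 0:
# |H| = 1 + a = 1.95 at omega = pi and 1 - a = 0.05 at omega = 0.
boost = magnitude_db(math.pi) - magnitude_db(0.0)
print(round(boost, 1))  # 31.8
```

The ratio 1.95/0.05 = 39 corresponds to roughly 31.8 dB, which agrees with the "about 32 dB" quoted for $a = 0.95$.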
When an adaptive filter is used, its transfer function usually has the form:
$$ H(z) = 1 - a_n z^{-1} \tag{5.5} $$
where $a_n$ changes with time $n$ according to a chosen adaptation criterion. A typical choice is $a_n = r_n(1)/r_n(0)$.
b) Frame blocking
The preemphasized signal $\tilde{s}(n)$ is blocked into frames of $N$ samples, with adjacent frames separated by $M$ samples. Figure 5.5 illustrates the case $M = N/3$. The first frame consists of the first $N$ samples; the second frame begins $M$ samples after the first and shares $N-M$ samples with it. Similarly, the third frame begins $2M$ samples after the first frame (or $M$ samples after the second) and shares $N-2M$ and $N-M$ samples with the first and second frames, respectively. This process continues until all of the speech is accounted for within one or more frames. It is easy to see that if $M \le N$, adjacent frames overlap, and the resulting LPC spectral estimates are correlated from frame to frame; if $M \ll N$, the LPC spectral estimates vary quite smoothly from frame to frame. On the other hand, if $M > N$, adjacent frames do not overlap; part of the signal is lost entirely (it never appears in any analysis frame), and the correlation between the LPC spectral estimates of adjacent frames contains a noisy component whose magnitude grows as $M$ increases (that is, as more speech samples are skipped). This situation is intolerable in any LPC analysis used for speech recognition. Denote the $l$-th frame by $x_l(n)$ and suppose there are $L$ frames in total; then:
$$ x_l(n) = \tilde{s}(Ml + n), \quad n = 0, 1, \ldots, N-1; \quad l = 0, 1, \ldots, L-1 \tag{5.6} $$
That is, the first frame $x_0(n)$ contains the samples $\tilde{s}(0), \tilde{s}(1), \ldots, \tilde{s}(N-1)$; the second frame $x_1(n)$ contains the samples $\tilde{s}(M), \tilde{s}(M+1), \ldots, \tilde{s}(M+N-1)$; and the $L$-th frame $x_{L-1}(n)$ contains the samples $\tilde{s}(M(L-1)), \tilde{s}(M(L-1)+1), \ldots, \tilde{s}(M(L-1)+N-1)$. For speech sampled at 6.67 kHz, typical values of $N$ and $M$ are 300 and 100, corresponding to 45 ms frames spaced 15 ms apart.
Figure 5.5 Frame blocking in LPC analysis for speech recognition
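Equation (5.6) maps directly onto list slicing. The sketch below is illustrative only; the name `frame_signal` is hypothetical, and dropping trailing samples that do not fill a complete frame is an assumption (the text does not say how the end of the signal is handled).

```python
def frame_signal(s, N, M):
    """Return frames x_l(n) = s[M*l + n], n = 0..N-1, for all complete frames."""
    if len(s) < N:
        return []
    L = (len(s) - N) // M + 1          # number of complete frames
    return [s[M * l : M * l + N] for l in range(L)]

# Example with the typical values for 6.67 kHz speech: N = 300, M = 100.
signal = list(range(500))              # stand-in for the preemphasized samples
frames = frame_signal(signal, 300, 100)
print(len(frames))                     # 3
print(frames[1][0], frames[1][-1])     # 100 399
```

With these values, adjacent frames share $N - M = 200$ samples, as described above.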
c) Windowing
The next step in the LPC processing is to window each individual frame so as to minimize the signal discontinuities at the beginning and end of each frame. As discussed in the general introduction in terms of the frequency domain, the window tapers the signal to zero at the beginning and end of each frame. If the window is defined as $w(n)$, $0 \le n \le N-1$, the result of windowing is:
$$ \tilde{x}_l(n) = x_l(n)\,w(n), \quad 0 \le n \le N-1 \tag{5.7} $$
A typical window for the autocorrelation method of LPC, as used in speech recognition systems, is the Hamming window, given by:
$$ w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1 \tag{5.8} $$
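Equations (5.7) and (5.8) can be written as the following small sketch. The function names are hypothetical, and the code assumes $N \ge 2$ so that the denominator $N-1$ is nonzero.

```python
import math

def hamming(N):
    # w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)), 0 <= n <= N-1 (assumes N >= 2)
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def apply_window(frame):
    # x~_l(n) = x_l(n) * w(n)
    return [x * w for x, w in zip(frame, hamming(len(frame)))]

w = hamming(5)
print([round(v, 2) for v in w])  # [0.08, 0.54, 1.0, 0.54, 0.08]
```

Note that the Hamming window tapers to 0.08 at the frame edges rather than exactly to zero.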
d) Autocorrelation analysis
Each windowed frame is then autocorrelated to give:
$$ r_l(m) = \sum_{n=0}^{N-1-m} \tilde{x}_l(n)\,\tilde{x}_l(n+m), \quad m = 0, 1, \ldots, p \tag{5.9} $$
where the highest autocorrelation lag, $p$, is the order of the LPC analysis. Typically $p$ is chosen from 8 to 16. A side benefit of the autocorrelation method is that the zeroth autocorrelation, $r_l(0)$, is the energy of the $l$-th frame. The frame energy is an important parameter in speech detection systems.
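A direct transcription of (5.9) into Python, as an illustrative sketch (the name `autocorrelation` is hypothetical):

```python
def autocorrelation(x, p):
    """r(m) = sum_{n=0}^{N-1-m} x(n) * x(n+m) for m = 0..p."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) for m in range(p + 1)]

r = autocorrelation([1, 2, 3, 4], 2)
print(r)  # [30, 20, 11]; r[0] = 30 is the frame energy
```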
e) LPC analysis
The next step is the LPC analysis, which converts each frame of $p+1$ autocorrelation values into an LPC parameter set. The LPC parameter set may be the LPC coefficients, the reflection coefficients, the log area ratio coefficients, the cepstral coefficients, or any desired transformation of these sets. The conversion is usually carried out with Durbin's method, given below. For convenience, we omit the frame index $l$ from $r_l(m)$.
$$ E^{(0)} = r(0) \tag{5.10} $$
$$ k_i = \frac{r(i) - \sum_{j=1}^{i-1} \alpha_j^{(i-1)}\, r(|i-j|)}{E^{(i-1)}}, \quad 1 \le i \le p \tag{5.11} $$
$$ \alpha_i^{(i)} = k_i \tag{5.12} $$
$$ \alpha_j^{(i)} = \alpha_j^{(i-1)} - k_i\,\alpha_{i-j}^{(i-1)} \tag{5.13} $$
$$ E^{(i)} = (1 - k_i^2)\,E^{(i-1)} \tag{5.14} $$
The summation in (5.11) is omitted for $i = 1$. The equations are solved recursively for $i = 1, 2, \ldots, p$, and the final result is:
$$ a_m = \alpha_m^{(p)}, \quad 1 \le m \le p \tag{5.15} $$
$$ k_m = \text{reflection coefficients} \tag{5.16} $$
$$ g_m = \log\left(\frac{1 - k_m}{1 + k_m}\right) \tag{5.17} $$
Here (5.15) gives the LPC coefficients, (5.16) the reflection coefficients, and (5.17) the log area ratio coefficients.
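The recursion (5.10) to (5.17) can be sketched as below. This is an illustrative implementation under the assumption that every $|k_i| < 1$ (otherwise the logarithm in (5.17) is undefined); the function name `levinson_durbin` and its return layout are hypothetical.

```python
import math

def levinson_durbin(r, p):
    """Durbin's recursion on autocorrelations r[0..p].

    Returns (a, k, g, E): LPC coefficients a_1..a_p, reflection
    coefficients k_1..k_p, log area ratios g_1..g_p, and the final
    prediction error E^(p).
    """
    E = r[0]                               # (5.10)
    alpha = [0.0] * (p + 1)                # alpha[j] = alpha_j^(i); index 0 unused
    k = [0.0] * (p + 1)
    for i in range(1, p + 1):
        # (5.11); the sum is empty when i == 1
        acc = r[i] - sum(alpha[j] * r[i - j] for j in range(1, i))
        k[i] = acc / E
        new_alpha = alpha[:]
        new_alpha[i] = k[i]                # (5.12)
        for j in range(1, i):              # (5.13)
            new_alpha[j] = alpha[j] - k[i] * alpha[i - j]
        alpha = new_alpha
        E = (1 - k[i] ** 2) * E            # (5.14)
    a = alpha[1:]                          # (5.15)
    g = [math.log((1 - km) / (1 + km)) for km in k[1:]]  # (5.17)
    return a, k[1:], g, E

a, k, g, E = levinson_durbin([1.0, 0.5, 0.25], 2)
print(a, k, E)  # [0.5, 0.0] [0.5, 0.0] 0.75
```

The example autocorrelation sequence decays as $0.5^m$, so a first-order predictor already explains it and the second reflection coefficient comes out as zero.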
f) Converting LPC parameters to cepstral coefficients
An important parameter set that can be derived directly from the LPC parameters is the set of LPC cepstral coefficients. The recursion is:
$$ c_0 = \ln \sigma^2 \tag{5.18} $$
$$ c_m = a_m + \sum_{k=1}^{m-1} \left(\frac{k}{m}\right) c_k\, a_{m-k}, \quad 1 \le m \le p \tag{5.19} $$
$$ c_m = \sum_{k=1}^{m-1} \left(\frac{k}{m}\right) c_k\, a_{m-k}, \quad m > p \tag{5.20} $$
Here $\sigma^2$ is the gain term of the LPC model. The cepstral coefficients are the coefficients of the Fourier transform representation of the log magnitude spectrum. They have been shown to be a more robust and reliable feature set for speech recognition than the LPC coefficients, the reflection coefficients, or the log area ratio coefficients. A representation with $Q > p$ cepstral coefficients is generally used, typically with $Q \approx (3/2)p$.
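The recursion (5.18) to (5.20) can be sketched as follows. The name `lpc_to_cepstrum` is hypothetical, and the code assumes that $a_j$ is treated as zero for $j > p$ in (5.20), which is the usual reading of that formula.

```python
import math

def lpc_to_cepstrum(a, sigma2, Q):
    """Cepstral coefficients c_0..c_Q from LPC coefficients a_1..a_p."""
    p = len(a)
    c = [0.0] * (Q + 1)
    c[0] = math.log(sigma2)                      # (5.18)
    for m in range(1, Q + 1):
        acc = a[m - 1] if m <= p else 0.0        # a_m term only for m <= p
        for k in range(1, m):
            if m - k <= p:                       # a_{m-k} taken as 0 beyond order p
                acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc                               # (5.19) / (5.20)
    return c

# First-order model with a_1 = 0.5: the cepstrum is c_m = 0.5**m / m.
print(lpc_to_cepstrum([0.5], 1.0, 3))
```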
g) Parameter weighting
Among the cepstral coefficients, the low-order coefficients are very sensitive to the overall spectral slope, while the high-order coefficients are very sensitive to noise. For this reason it has become a standard technique to weight the cepstral coefficients with a tapered window to reduce these sensitivities. A common way to motivate the use of a cepstral window is to consider the Fourier representation of the log magnitude spectrum and of its derivative with respect to frequency:
$$ \log\left|S(e^{j\omega})\right| = \sum_{m=-\infty}^{\infty} c_m\, e^{-j\omega m} \tag{5.21} $$
$$ \frac{\partial}{\partial\omega}\left[\log\left|S(e^{j\omega})\right|\right] = \sum_{m=-\infty}^{\infty} (-jm)\, c_m\, e^{-j\omega m} \tag{5.22} $$
The differentiated log magnitude spectrum has the property that any fixed spectral slope in the log magnitude spectrum becomes a constant. Moreover, any prominent spectral peak in the log magnitude spectrum (that is, a formant) is preserved as a peak in the differentiated log magnitude spectrum. Therefore, by treating the multiplication by $(-jm)$ in the representation of the differentiated log magnitude spectrum as a form of weighting, we obtain:
$$ \frac{\partial}{\partial\omega}\left[\log\left|S(e^{j\omega})\right|\right] = \sum_{m=-\infty}^{\infty} \hat{c}_m\, e^{-j\omega m} \tag{5.23} $$
where:
$$ \hat{c}_m = c_m\,(-jm) \tag{5.24} $$
To achieve robustness for large values of $m$ (that is, low weight near $m = Q$) and to truncate the infinite computation in (5.23), a more general form of weighting is needed:
$$ \hat{c}_m = w_m\, c_m \tag{5.25} $$
An appropriate weighting is the bandpass lifter (a filter in the cepstral domain):
$$ w_m = 1 + \frac{Q}{2}\sin\left(\frac{\pi m}{Q}\right), \quad 1 \le m \le Q \tag{5.26} $$
The weighting function (5.26) truncates the infinite computation and de-emphasizes the coefficients $c_m$ around $m = 1$ and $m = Q$.
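The lifter weights of (5.26) and the weighting of (5.25) are easy to tabulate. The sketch below is illustrative; the function names are hypothetical, and the cepstral input is assumed to hold $c_1, \ldots, c_Q$ (without $c_0$).

```python
import math

def lifter_weights(Q):
    # w_m = 1 + (Q/2) * sin(pi*m/Q), m = 1..Q
    return [1 + (Q / 2) * math.sin(math.pi * m / Q) for m in range(1, Q + 1)]

def weight_cepstrum(c):
    # c holds c_1..c_Q; returns the weighted coefficients w_m * c_m
    return [w * cm for w, cm in zip(lifter_weights(len(c)), c)]

w = lifter_weights(12)
print([round(v, 2) for v in w])
```

For $Q = 12$ the weights rise from about 2.55 at $m = 1$ to 7 at $m = 6$ and fall back to 1 at $m = 12$, so both ends of the cepstral vector are de-emphasized relative to the middle.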
h) Cepstral derivatives
The cepstral representation of the speech spectrum is a good representation of the local spectral properties of the signal within a given analysis frame. The representation can be improved by extending the analysis to include information about the temporal cepstral derivative. In practice, both first and second derivatives have been found to improve the performance of speech recognition systems. To introduce time into the cepstral representation, we denote the $m$-th cepstral coefficient at time $t$ by $c_m(t)$. In practice, the sampling time $t$ refers to an analysis frame rather than an arbitrary instant. The time derivative of the cepstral coefficients is approximated as follows. The time derivative of the log magnitude spectrum has the Fourier series representation:
$$ \frac{\partial}{\partial t}\left[\log\left|S(e^{j\omega}, t)\right|\right] = \sum_{m=-\infty}^{\infty} \frac{\partial c_m(t)}{\partial t}\, e^{-j\omega m} \tag{5.27} $$
The temporal cepstral derivative is therefore determined in a similar way. Because $c_m(t)$ is a discrete-time representation (where $t$ is the frame index), a simple first-order or second-order difference is not a suitable approximation to the derivative (the result is very noisy). A reasonable compromise is to approximate $\partial c_m(t)/\partial t$ by an orthogonal polynomial fit, a least-squares estimate of the derivative, over a finite-length window:
$$ \frac{\partial c_m(t)}{\partial t} = \Delta c_m(t) \approx \mu \sum_{k=-K}^{K} k\, c_m(t+k) \tag{5.28} $$
where $\mu$ is an appropriate normalization constant and $(2K+1)$ is the number of frames over which the computation is performed. Typically $K = 3$ has been found appropriate for computing the first derivative. With this procedure, for each frame $t$ the result of the LPC analysis is a vector of $Q$ weighted cepstral coefficients extended by a vector of $Q$ time-derivative components, denoted:
$$ o'_t = \left(\hat{c}_1(t), \hat{c}_2(t), \ldots, \hat{c}_Q(t), \Delta c_1(t), \Delta c_2(t), \ldots, \Delta c_Q(t)\right) \tag{5.29} $$
In (5.29), $o_t$ is a vector of $2Q$ components and $(\cdot)'$ denotes matrix transpose.
Similarly, if the second derivatives $\Delta^2 c_m(t)$ are computed and appended to the vector $o_t$, the result is a new vector of $3Q$ components.
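Equation (5.28) and the stacked vector of (5.29) can be sketched as follows. Two choices here are assumptions, since the text leaves them open: the normalization $\mu = 1/\sum_{k=-K}^{K} k^2$ (which makes the result the least-squares slope), and repeating the first and last frames at the edges of the utterance. The name `delta_cepstrum` is hypothetical.

```python
def delta_cepstrum(cep, K=3, mu=None):
    """Approximate time derivative of each cepstral coefficient, eq. (5.28).

    cep[t][m] is coefficient m of frame t.
    """
    T, Q = len(cep), len(cep[0])
    if mu is None:
        # Assumed normalization: least-squares slope over 2K+1 frames.
        mu = 1.0 / sum(k * k for k in range(-K, K + 1))
    out = []
    for t in range(T):
        row = []
        for m in range(Q):
            acc = 0.0
            for k in range(-K, K + 1):
                tk = min(max(t + k, 0), T - 1)   # assumed edge handling: clamp
                acc += k * cep[tk][m]
            row.append(mu * acc)
        out.append(row)
    return out

# A coefficient that grows by 2 per frame has derivative 2 in the interior.
cep = [[2.0 * t] for t in range(10)]
d = delta_cepstrum(cep, K=3)
o5 = cep[5] + d[5]          # stacked vector with 2Q components, as in (5.29)
print(o5)
```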
i) Typical values of the LPC analysis parameters
The LPC analysis computations depend on a number of parameters: the number of samples in the analysis frame, $N$; the number of samples separating the starts of adjacent frames, $M$; the LPC analysis order, $p$; the dimension of the derived cepstral vector, $Q$; and the number of frames, $K$, over which the cepstral time derivatives are computed. Although each of these parameters can vary over a wide range depending on the particular system, typical values for three sampling rates (6.67 kHz, 8 kHz, and 10 kHz) are given in the following table.
Parameter    Fs = 6.67 kHz    Fs = 8 kHz     Fs = 10 kHz
N            300 (45 ms)      240 (30 ms)    300 (30 ms)
M            100 (15 ms)      80 (10 ms)     100 (10 ms)
p            8                10             10
Q            12               12             12
K            3                3              3
Table 5.2: Typical parameter values for LPC analysis
5.5.3 MFCC analysis in speech recognition
The block diagram of Mel frequency cepstral analysis for extracting speech features is shown in Figure 5.6. It is a widely used technique representative of the class of feature extraction methods known as MFCCs (Mel frequency cepstral coefficients). First, the speech signal is filtered by a high-pass filter with a very low cut-off frequency to remove any DC component that may be introduced by the ADC. This filtering is needed in particular for an accurate computation of the frame energy in short-term analysis. The signal energy and the cepstral parameters are computed for each frame of a sliding window with shift $d_{shift}$ = 10 ms. Because human perception of sound follows a nonlinear scale, the signal energy is usually expressed on a logarithmic scale. The logarithmic frame energy is used as one component of the feature vector. Another high-pass filter is then applied as preemphasis, to boost the higher-frequency components, where the signal tends to have low energy. The short-term spectrum is then computed by multiplying the samples of the frame by a Hamming window and applying the fast Fourier transform (FFT). Only the magnitude spectrum is kept at this point, because the short-term phase spectrum does not carry useful information about the speech signal. The human auditory system is known to accumulate energy within critical bands. Based on this property, a Mel-scale filterbank is used. The filterbank consists of 23 subbands. The FFT spectral components are weighted by a triangular function and accumulated within a given frequency range to form one Mel spectral component. The bandwidths increase with frequency, following a linear relationship on the Mel frequency scale. As with the signal energy, the logarithm of the Mel spectrum is taken. Adjacent Mel spectral components are fairly correlated. To extract features that are largely statistically independent of one another, the discrete cosine transform (DCT) is applied to the log Mel spectrum. These statistically independent features simplify the modeling of speech characteristics in the reference models and the computation of similarities during pattern matching.
Figure 5.6 Block diagram of the MFCC analysis
In the preprocessing scheme standardized by ETSI, 13 cepstral coefficients are computed, including the zeroth cepstral coefficient. Note that the zeroth cepstral coefficient represents the mean of the log Mel spectrum, so it is closely related to the frame energy. Usually either the logarithmic frame energy computed from the time signal or the zeroth cepstral coefficient is used as a parameter in recognition. Feature vectors for speech recognition typically consist of the logarithmic frame energy and the 12 cepstral coefficients $C_1$ to $C_{12}$. To apply adaptation techniques that improve recognition performance, the parameter $C_0$ needs to be known. $C_0$ is therefore often extracted separately for use in training, and $C_0$ becomes a parameter of the HMM. This means that a set of cepstral coefficients in the reference patterns can be transformed back into a Mel spectrum. Note, however, that $C_0$ is not used in the pattern matching itself.
The acoustic parameters introduced above are called static parameters, because they are computed from the speech signal over a short frame of about 25 ms. To improve recognition performance, a set of dynamic parameters is also considered. These are obtained by observing the contour of each static parameter over time and computing the derivative of that contour. Parameters computed this way are called delta coefficients. The first-order derivative $\Delta C_i(k)$ of cepstral coefficient $C_i$ is computed as:
$$ \Delta C_i(k) = \frac{\sum_{j=1}^{N} j\,\left[C_i(k+j) - C_i(k-j)\right]}{\sum_{j=1}^{N} j^2} \tag{5.30} $$
H s N
y w y w (5.33)
Recognizers often approximate this formula by a maximization, so that alternative pronunciations can be decoded as if they were alternative word hypotheses. Each $Q$ is a sequence of word pronunciations $Q_1, \ldots, Q_K$, where each pronunciation is a sequence of base phones $Q_k = q_1^{(k)}, q_2^{(k)}, \ldots$. Then:
$$ p(Q \mid w) = \prod_{k=1}^{K} p(Q_k \mid w_k) \tag{5.34} $$
where $p(Q_k \mid w_k)$ is the probability that word $w_k$ is pronounced by the base phone sequence $Q_k$. In practice there are only a few possible pronunciations $Q_k$ for each word $w_k$, which makes the summation in (5.33) easily tractable.
Figure 5.19 Base phone model based on an HMM
Each base phone $q$ is represented by a continuous-density hidden Markov model (HMM), as illustrated in Figure 5.19. In this illustration the transition parameters are $\{a_{ij}\}$ and the output observation distributions are $\{b_j(\cdot)\}$. The output observation distributions are usually mixtures of Gaussians:
$$ b_j(y) = \sum_{m=1}^{M} c_{jm}\, \mathcal{N}\left(y;\, \mu_{jm}, \Sigma_{jm}\right) \tag{5.35} $$
where $\mathcal{N}$ denotes a normal distribution with mean $\mu_{jm}$ and covariance $\Sigma_{jm}$. The number of components in (5.35) is typically between 10 and 20. Because the dimension of the acoustic vectors $y$ is usually relatively large, the covariances are usually constrained to be diagonal matrices. The entry and exit states are nonemitting states; they are included to simplify the concatenation of phone models into words.
Given a composite HMM for $Q$, formed by concatenating all of its constituent base phones, the acoustic likelihood is:
$$ p(y \mid Q) = \sum_{X} p(X, y \mid Q) \tag{5.36} $$
where $X = x(0), \ldots, x(T+1)$ is a state sequence through the composite model and
$$ p(X, y \mid Q) = a_{x(0),x(1)} \prod_{t=1}^{T} b_{x(t)}(y_t)\, a_{x(t),x(t+1)} \tag{5.37} $$
The acoustic model parameters $\{a_{ij}\}$ and $\{b_j(\cdot)\}$ can be estimated efficiently from a set of training data using the expectation-maximization method.
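The output distribution (5.35) with diagonal covariances can be evaluated as in the sketch below. It is illustrative only: the function names are hypothetical, and each diagonal covariance $\Sigma_{jm}$ is represented simply as a list of per-dimension variances.

```python
import math

def diag_gaussian(y, mean, var):
    # Density of a normal distribution with diagonal covariance,
    # accumulated in the log domain for numerical stability.
    log_p = 0.0
    for yi, mi, vi in zip(y, mean, var):
        log_p += -0.5 * (math.log(2 * math.pi * vi) + (yi - mi) ** 2 / vi)
    return math.exp(log_p)

def mixture_density(y, weights, means, variances):
    # b_j(y) = sum_m c_jm * N(y; mu_jm, Sigma_jm), eq. (5.35)
    return sum(c * diag_gaussian(y, mu, var)
               for c, mu, var in zip(weights, means, variances))

# Two-component mixture in two dimensions (toy numbers).
b = mixture_density([0.0, 0.0],
                    weights=[0.5, 0.5],
                    means=[[0.0, 0.0], [1.0, 1.0]],
                    variances=[[1.0, 1.0], [1.0, 1.0]])
print(b)
```

Restricting the covariances to diagonal form means that each component needs only one variance per dimension, which is why the density factorizes into the per-dimension loop above.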
5.7. Practical exercise: speech recognition
Using a personal computer and Matlab (or another programming language), carry out the following:
- Build a simple (limited-vocabulary) speech recognition system based on:
o A neural network
o An HMM
Appendix 1: Neural networks
Introduction
Research into the mechanisms and structure of the human brain began quite early. With advances in science, important progress has been made in this field. However, the human brain is a very complex system, and our understanding of its architecture and operation is still incomplete. Nevertheless, machines have been built that have some brain-like capabilities by imitating the following characteristics:
- Knowledge is acquired through a learning process
- Capabilities arise from the network architecture and the nature of its connections
These machines are collectively called artificial neural networks, or simply neural networks. The main characteristics of neural networks are:
- Nonlinearity. They allow nonlinear processing.
- Input-output mapping, which allows supervised learning.
- Adaptivity. Parameters change to suit the environment.
- Response according to the training patterns.
- Contextual information. Knowledge is represented by the state and architecture of the network.
- Fault tolerance.
- Biological analogy.
Neural network basics
A simple neural network is illustrated in Figure A.1. Suppose there are $N$ inputs labeled $x_1, x_2, \ldots, x_N$ with corresponding weights $w_1, w_2, \ldots, w_N$. The nonlinear input-output relationship is then:
$$ y = f\left(\sum_{i=1}^{N} w_i x_i - \phi\right) $$
where $\phi$ is an internal threshold, also called the offset, and $f(\cdot)$ is a nonlinear function.
Figure A.1: Simple structure of a neural network with N inputs
Some common forms of $f$ are:
1. Hard limiter:
$$ f(x) = \begin{cases} +1 & x \ge 0 \\ -1 & x < 0 \end{cases} $$
2. Log-sigmoid function:
$$ f(x) = \frac{1}{1 + e^{-\beta x}}, \quad \beta > 0 $$
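The input-output relationship and the two nonlinearities above can be sketched as follows. The function names are hypothetical, and the threshold is written `phi` to match the symbol used in the formula.

```python
import math

def hard_limiter(x):
    # f(x) = +1 for x >= 0, -1 otherwise
    return 1.0 if x >= 0 else -1.0

def sigmoid(x, beta=1.0):
    # f(x) = 1 / (1 + exp(-beta*x)), beta > 0
    return 1.0 / (1.0 + math.exp(-beta * x))

def neuron(x, w, phi, f):
    # y = f(sum_i w_i * x_i - phi)
    return f(sum(wi * xi for wi, xi in zip(w, x)) - phi)

print(neuron([1.0, 2.0], [0.5, 0.5], 1.0, hard_limiter))  # 1.0
print(neuron([1.0, 2.0], [0.5, 0.5], 1.5, sigmoid))       # 0.5
```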
Neural network configurations
An important factor in designing and applying a neural network is the network topology. There are three basic types of structure:
1) Single-layer or multilayer networks:
Figure A.2: Structure of a single-layer (a) and a two-layer (b) neural network
2) Recurrent networks:
Figure A.3: Structure of a recurrent neural network
3) Self-organizing networks:
Figure A.4: Structure of a 3x3 self-organizing (SOM) neural network
Appendix 2: Hidden Markov models
Markov processes
A random process $X(t)$ is called a Markov process if, given the present state, the future of the process does not depend on its past. In other words, for arbitrary times $t_1 < t_2 < \ldots < t_k < t_{k+1}$:
$$ \Pr\left[X(t_{k+1}) = x_{k+1} \mid X(t_k) = x_k, \ldots, X(t_1) = x_1\right] = \Pr\left[X(t_{k+1}) = x_{k+1} \mid X(t_k) = x_k\right] $$
The value of $X(t)$ at time $t$ is usually called the state of the process at time $t$.
Discrete-time Markov chains
Let $X_n$ be an integer-valued, discrete-time Markov chain that starts at $n = 0$ with probability mass function (pmf):
$$ p_j(0) = \Pr[X_0 = j], \quad j = 0, 1, \ldots $$
The joint pmf of the first $n+1$ values of the process is then:
$$ \Pr[X_n = i_n, \ldots, X_0 = i_0] = \Pr[X_n = i_n \mid X_{n-1} = i_{n-1}] \cdots \Pr[X_1 = i_1 \mid X_0 = i_0]\,\Pr[X_0 = i_0] $$
Thus the joint pmf of a particular sequence is the product of the probability of the initial state and the probabilities of the subsequent one-step state transitions.
Suppose the one-step state transition probabilities are fixed and do not change with time, that is:
$$ \Pr[X_{n+1} = j \mid X_n = i] = a_{ij} \quad \text{for all } n $$
Then $X_n$ is said to have homogeneous transition probabilities, and the joint pmf of $X_0, \ldots, X_n$ becomes:
$$ \Pr[X_n = i_n, \ldots, X_0 = i_0] = a_{i_{n-1} i_n} \cdots a_{i_0 i_1}\, p_{i_0}(0) $$
So $X_n$ is completely specified by the initial pmf $p_i(0)$ and the matrix of one-step transition probabilities $P$:
$$ P = \begin{bmatrix} a_{00} & a_{01} & a_{02} & \cdots \\ a_{10} & a_{11} & a_{12} & \cdots \\ \vdots & \vdots & \vdots & \\ a_{i0} & a_{i1} & a_{i2} & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix} $$
$P$ is called the transition probability matrix. Note that each row of $P$ must sum to 1.
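The product formula for the joint pmf of a homogeneous chain can be checked numerically. The sketch below is illustrative; the function name and the two-state example values are hypothetical, and states are numbered from 0 as in the matrix $P$.

```python
def sequence_probability(states, p0, P):
    """Pr[X_0 = i_0, ..., X_n = i_n] = p_{i_0}(0) * a_{i_0 i_1} * ... * a_{i_{n-1} i_n}."""
    prob = p0[states[0]]
    for i, j in zip(states, states[1:]):
        prob *= P[i][j]
    return prob

P = [[0.9, 0.1],
     [0.5, 0.5]]          # each row sums to 1
p0 = [0.6, 0.4]           # initial pmf p_i(0)

# Probability of the state sequence 0 -> 0 -> 1: 0.6 * 0.9 * 0.1
print(sequence_probability([0, 0, 1], p0, P))
```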
Figure B.1 illustrates a discrete Markov chain with 5 states labeled $S_1$ to $S_5$; the branches are labeled with the corresponding transition probabilities $a_{ij}$.
Figure B.1: A discrete Markov chain with 5 states
Hidden Markov models
The previous section considered Markov models in which each state corresponds to an observable (physical) event. Such models have limited applicability to practical problems. The model is therefore extended to include cases in which the observation is a probabilistic function of the state; that is, the model is a doubly embedded stochastic process with an underlying stochastic process that is not observable (it is hidden) and can only be observed through another set of stochastic processes that produce the sequence of observations. Such a model is called a hidden Markov model (HMM).
As an illustration, consider the following coin-toss models. A person performs a coin-tossing experiment but does not tell us exactly what he is doing. He only tells us the result of each coin flip. For us, then, a series of hidden coin-tossing experiments is carried out, and the only thing observed is the sequence of heads and tails. The problem is how to build an appropriate HMM to model the observed sequence of heads and tails. The first question is to decide what the states in the model correspond to, and then how many states the model needs.
Figure B.2 illustrates three example cases. The first case corresponds to the assumption that a single biased coin is tossed. The model is a two-state model in which each state corresponds to one side of the coin. The Markov model in this case is clearly observable. Note also that we could use a one-state Markov model in which the state corresponds to the single biased coin and the unknown parameter is the bias of the coin.
Figure B.2: Three possible Markov models for the hidden coin-tossing experiment
The second case corresponds to a two-state model in which each state corresponds to a different biased coin being tossed. Each state is characterized by a probability distribution of heads and tails, and the transitions between states are characterized by a state transition matrix.
The third case corresponds to an experiment using three different biased coins, where the choice among the three coins is governed by a probabilistic event.
Given the choice among these three cases to explain the observed sequence of heads and tails, the question is which model best matches the actual observations. The model in the first case has only one unknown parameter; in other words, it has one degree of freedom. The models in the second and third cases have 4 and 9 degrees of freedom, respectively. With more degrees of freedom, the larger HMMs would seem inherently more capable of modeling a series of coin-tossing experiments than the smaller models. Note, however, that although this is true in theory, in practice there are strong limitations on the size of the model.
An HMM is characterized by:
1) $N$, the number of states in the model. Although the states are hidden, in many practical applications there is some physical significance attached to the states or to sets of states of the model.
2) $M$, the number of distinct observation symbols per state, that is, the size of the discrete alphabet.
3) The state transition probability distribution $P = \{a_{ij}\}$, where $a_{ij} = \Pr[X_{n+1} = S_j \mid X_n = S_i]$, $1 \le i, j \le N$. In the special case where any state can reach any other state in a single step, $a_{ij} > 0$ for all $i, j$. For other types of HMM, $a_{ij} = 0$ for one or more pairs $(i, j)$.
4) The observation symbol probability distribution in state $j$, $B = \{b_j(k)\}$, where $b_j(k) = \Pr[v_k \text{ at } t \mid X_t = S_j]$, $1 \le j \le N$, $1 \le k \le M$.
5) The initial state distribution $\pi = \{\pi_i\}$, where $\pi_i = \Pr[X_1 = S_i]$, $1 \le i \le N$.
Given values of $N$, $M$, $P$, $B$, and $\pi$, the HMM can be used as a generator of an observation sequence $O = O_1 O_2 \ldots O_T$ (where each observation $O_t$ is a symbol from the set $V$ and $T$ is the number of observations in the sequence) as follows:
1) Choose an initial state $X_1 = S_i$ according to the initial state distribution $\pi$.
2) Set $t = 1$.
3) Choose $O_t = v_k$ according to the symbol probability distribution in state $S_i$, that is, $b_i(k)$.
4) Move to a new state $X_{t+1} = S_j$ according to the state transition probability distribution for state $S_i$, that is, $a_{ij}$.
5) Set $t = t + 1$; return to step 3 if $t \le T$; otherwise end the procedure.
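The generation procedure above can be sketched with the standard-library `random` module. This is an illustration under stated assumptions: states and symbols are numbered from 0 rather than 1, the function name `generate_observations` and the `seed` parameter are hypothetical, and the loop emits exactly $T$ observations.

```python
import random

def generate_observations(pi, A, B, T, seed=0):
    """Generate T symbol indices from an HMM (pi, A, B).

    pi[i]   : initial state probabilities
    A[i][j] : state transition probabilities a_ij
    B[i][k] : probability of symbol v_k in state S_i
    """
    rng = random.Random(seed)
    n_states, n_symbols = len(pi), len(B[0])
    # Step 1: choose the initial state according to pi.
    state = rng.choices(range(n_states), weights=pi)[0]
    observations = []
    for _ in range(T):                       # steps 2 and 5: t = 1..T
        # Step 3: emit a symbol according to b_i(k).
        symbol = rng.choices(range(n_symbols), weights=B[state])[0]
        observations.append(symbol)
        # Step 4: move to the next state according to a_ij.
        state = rng.choices(range(n_states), weights=A[state])[0]
    return observations

# Two hidden "coins" with different biases (symbol 0 = heads, 1 = tails).
pi = [0.5, 0.5]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
print(generate_observations(pi, A, B, 10))
```

Only the symbol sequence is returned; the state sequence stays internal to the function, which mirrors the sense in which the states are hidden from the observer.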