
VIETNAM POSTS AND TELECOMMUNICATIONS GROUP

POSTS AND TELECOMMUNICATIONS INSTITUTE OF TECHNOLOGY

*******************************

LECTURE NOTES
SPEECH PROCESSING

COMPILED BY:
PHẠM VĂN SỰ
LÊ XUÂN THÀNH

HANOI - 2010

Preface
Speech is a convenient, innate means by which people exchange information. The dream of "talking machines" and "machines that understand speech" appears not only in science fiction of long ago; it has also driven many researchers and research groups around the world. Speech processing research has been under way for nearly a century and has achieved a great deal in building and developing speech processing techniques and technologies. Even so, a "talking machine" that sounds natural (in intonation, pronunciation, and so on) and a true "speech-understanding machine" are still a long way off. The trend toward technology convergence in the 21st century makes it all the more pressing to refine the technology further and reach these goals. A grasp of both the basic techniques and the advanced technologies of speech processing is therefore genuinely necessary for students of Signal Processing and Communications in particular, and of Electrical and Electronic Engineering in general. To that end, these lecture notes for the Speech Processing course aim to equip students with the important basic concepts and to introduce them to advanced technologies and to research and development trends in the field. The notes are divided into five chapters:
1. Basic concepts.
2. Digital representation of speech signals.
3. Speech analysis.
4. Speech synthesis.
5. Speech recognition.
Chapters 1 and 2 were written by lecturer Lê Xuân Thành; the remaining chapters were written by lecturer Phạm Văn Sự. These notes were completed under time pressure and, despite the authors' best efforts, their limited experience means that errors and oversights are unavoidable. The authors sincerely welcome comments from colleagues and students so that the next edition can be improved.
Please send comments to: Department of Circuit Theory, Faculty of Electronic Engineering I, Posts and Telecommunications Institute of Technology, Km10 Nguyễn Trãi Road, Hà Đông, Hanoi, or by email to xulytiengnoi@gmail.com.
Hanoi, 2 May 2010
The authors






List of abbreviations
ADC        Analog-to-Digital Converter
ADM        Adaptive Delta Modulation
ADPCM      Adaptive Differential PCM (adaptive differential pulse code modulation)
CSR        Continuous Speech Recognition
DCT        Discrete Cosine Transform
DFT        Discrete Fourier Transform
DM         Delta Modulation
DTFT       Discrete-Time FT (discrete-time Fourier transform)
DPCM       Differential PCM (differential pulse code modulation)
FFT        Fast FT (fast Fourier transform)
FIR        Finite Impulse Response (filter)
FT         Fourier Transform
HMM        Hidden Markov Model
IDFT       Inverse Discrete FT
IDTFT      Inverse DTFT
IFT        Inverse FT
LMS        Least Mean Square
LPC        Linear Predictive Coding
LTI        Linear Time-Invariant (filter)
MFCC       Mel-Frequency Cepstral Coefficients
NLP        Natural Language Processing
PAM        Pulse Amplitude Modulation
SNR        Signal-to-Noise Ratio
ST         Short-Time Transform
STFT       Short-Time FT (short-time Fourier transform)
TDNN       Time-Delay Neural Network
TD-PSOLA   Time-Domain PSOLA (pitch-synchronous overlap-add in the time domain)

Contents
Preface
List of abbreviations
Contents
Chapter 1: Basic concepts
  1.1. Introduction
    1.1.1 The origin of speech
    1.1.2 Classification of speech
  1.2. Speech production
    1.2.1 Structure of the articulatory system
    1.2.2 Structure of the auditory system
  1.3. Basic characteristics of speech
    1.3.1 Fundamental frequency and spectrum
    1.3.2 Representing the speech signal
Chapter 2: Digital representation of speech signals
  2.1. Introduction
  2.2. Sampling the speech signal
  2.3. Quantization
  2.4. Coding and decoding
  2.5. Differential pulse code modulation (DPCM)
  2.6. Delta modulation (DM)
  2.7. Adaptive delta modulation (ADM)
  2.8. Adaptive differential pulse code modulation (ADPCM)
  2.9. Lab exercise: methods for digital representation of speech signals
Chapter 3: Speech analysis
  3.1. Introduction
  3.2. Speech analysis model
  3.3. Short-time speech analysis
  3.4. Time-domain speech analysis
  3.5. Frequency-domain speech analysis
    3.5.1 Spectral structure of the speech signal
    3.5.2 Spectrogram
  3.6. Linear predictive coding (LPC) analysis
  3.7. Cepstral analysis
  3.8. Methods for determining formant frequencies
  3.9. Methods for determining the fundamental frequency
  3.10. Lab exercise: speech analysis
Chapter 4: Speech synthesis
  4.1. Introduction
  4.2. Speech synthesis methods
    4.2.1 Direct synthesis
    4.2.2 Formant synthesis
    4.2.3 Synthesis by simulating the articulatory apparatus
  4.3. Text-to-speech synthesis systems
  4.4. Lab exercise: speech synthesis
Chapter 5: Speech recognition
  5.1. Introduction
  5.2. History of speech recognition systems
  5.3. Classification of speech recognition systems
  5.4. Structure of a speech recognition system
  5.5. Analysis methods for speech recognition
    5.5.1 Vector quantization
    5.5.2 The LPC processor in speech recognition
    5.5.3 MFCC analysis in speech recognition
  5.6. Overview of speech recognition methods
    5.6.1 The acoustic-phonetic method
    5.6.2 Statistical pattern recognition
    5.6.3 Methods using artificial intelligence
    5.6.4 Neural networks in speech recognition systems
    5.6.5 Recognition systems based on hidden Markov models (HMM)
  5.7. Lab exercise: speech recognition
Appendix 1: Neural networks
Appendix 2: Hidden Markov models
References


























Chapter 1: Basic concepts
1.1. Introduction
Speech most often occurs in the form we call conversation, and conversation reflects human experience. A conversation is a process involving several people who share a common understanding and a convention of taking turns to speak. People in normal physical and mental condition express themselves in speech with ease, so speech is the main means of communication in conversation. Speech is supported by many other cues that help the listener grasp the intended meaning, such as facial expressions, gestures, and posture. Because it is interactive, speech is used wherever communication must be fast. Writing, by contrast, separates author and reader in both space and time. The expressiveness of speech strongly motivates computer systems that use speech, for example systems that store speech as a type of data or use speech as a two-way means of communication. If the communication process is analyzed as a stack of layers, the lowest layer is sound and the top layer is speech that conveys the intended meaning.
1.1.1 The origin of speech
The sounds of speech, like the sounds of the natural world around us, are in essence sound waves propagating through a medium (usually air). When we speak, the vocal cords in the throat vibrate and produce sound waves. These waves travel through the air to the eardrum, a thin and very sensitive membrane in the ear, and set it vibrating; the auditory nerves register a sensation of sound when the oscillation frequency of the wave reaches a certain level. The human ear responds only to vibrations with frequencies from about 16 Hz to about 20000 Hz. Vibrations in this frequency range are called acoustic vibrations, or sound, and the corresponding waves are called sound waves. Waves with frequencies below 16 Hz are called infrasound and waves with frequencies above 20000 Hz are called ultrasound; humans cannot perceive either (bats, for example, can hear ultrasound). Sound, ultrasound, and infrasound propagate not only through air but also, and well, through solids and liquids, and are therefore widely used in today's machines and devices.
1.1.2 Classification of speech
Speech is sound produced in order to convey information; it is highly flexible and distinctive. As the instrument of thought and intellect, speech is characteristic of humankind. It cannot be separated from humanity as a whole, and it is thanks to spoken language that human society has lived, developed, and become as cultured and civilized as it is today. Spoken communication consists of many utterances; each utterance consists of many words, and each word may consist of one or more syllables. Vietnamese uses about 6700 syllables. When we utter a sound, many organs such as the tongue, glottis, lips, pharynx, and larynx work together to produce it. The sound then propagates through the air to the listener's ear. Because the sound results from the cooperation of so many organs, it is almost never the same from one utterance to the next, which makes it rather difficult to classify speech by specific properties. Speech sounds are therefore divided into just three basic types:
Voiced sounds: sounds produced with voicing, for example "i", "a", or "o". A voiced sound arises when air passes through the glottis (the glottis is the opening formed as the vocal cords open and close under the control of the two arytenoid cartilages) with the vocal cords tensed so that they vibrate.
Unvoiced sounds: sounds produced while the vocal cords do not vibrate, or vibrate only slightly, giving a breathy quality, for example "h", "p", or "th".
Plosive sounds: to produce a plosive, the vocal apparatus first closes completely and pressure builds up; the air is then released abruptly, for example "ch" or "t".
1.2. Speech production
1.2.1 Structure of the articulatory system
Speech results from the coordinated activity of the respiratory and masticatory organs. This activity is controlled by the central nervous system, which continuously receives information through feedback from the auditory and proprioceptive organs. The respiratory apparatus supplies the necessary force as air is exhaled through the trachea. At the top of the trachea is the larynx, where the air pressure is modulated before it reaches the vocal tract, which extends from the pharynx to the lips (Figure 1.1).
The larynx is a set of muscles and movable cartilages surrounding a cavity in the upper part of the trachea. The vocal cords resemble a pair of symmetrical lips lying across the larynx; they can close the larynx completely, and when they open they form a triangular opening called the glottis. Air passes freely through the larynx during breathing and also during the articulation of voiceless, or unvoiced, sounds. Voiced sounds, by contrast, result from the periodic vibration of the vocal cords, and these successive vibrations then reach the vocal tract. The vocal tract is the set of cavities lying between the glottis and the lips; in the figure we can distinguish the pharyngeal cavity (throat), the oral cavity, and the nasal cavity.

Figure 1.1 The human speech production system
When we speak, the chest expands and contracts, and air is pushed from the lungs into the trachea and through the glottis formed by the vocal cords. This airflow is the excitation signal for the vocal tract; it is then pushed through the vocal tract and finally radiated at the lips. The vocal tract can be regarded as an acoustic tube (a cascade of tube sections of equal length but different cross-sectional areas) whose input is the vocal cords (or glottis) and whose output is the lips. The shape of the vocal tract thus changes as a function of time. Its cross-sections are determined by the positions of the tongue, lips, jaw, and velum, and their area varies from 0 cm² (lips closed) to about 20 cm² (lips open). The nasal tract forms an auxiliary acoustic path for sound transmission; it starts at the velum and ends at the nostrils. When the velum is lowered, the nasal tract is acoustically coupled to the vocal tract and nasal speech sounds are produced.
Speech sounds are produced in this system in three ways, depending on the excitation signal. Voiced sounds such as /i/ are produced when the vocal tract is excited by a train of pulses (the periodic vibration of the vocal cords), which defines the pitch period $T$ and its reciprocal, the fundamental frequency $F_0$. In tonal languages, the way $F_0$ varies also depends on the tone. Unvoiced sounds such as /s/ are produced when the vocal cords do not vibrate; the excitation behaves like random noise, generated by turbulent airflow through constrictions in the vocal tract (usually toward the oral cavity). Plosive sounds such as /p/ are produced by closing the vocal tract completely, building up pressure behind the closure, and then releasing it abruptly. Because the vocal and nasal tracts consist of acoustic tubes with different cross-sections, the spectrum of a sound travelling through them is shaped by the frequency selectivity of the tubes. In speech production, the resonance frequencies of the vocal tract are called formant frequencies, or simply formants. These frequencies depend on the shape and dimensions of the vocal tract, so each vocal tract shape is characterized by a set of formant frequencies. Different sounds are produced by changing the shape of the vocal tract, and the spectral properties of the speech signal therefore vary over time as the vocal tract shape varies. The passage of sound through the vocal tract, which reinforces certain frequency regions by resonance and gives each sound its distinctive character, is called articulation. An articulated sound carries phonemic information and is radiated outward from the lips. In some cases, for nasal sounds (such as /m/ and /n/ in English), the nasal tract also takes part in articulation and the sound is radiated from the nose. In summary, the speech waveform is produced by three actions: generation of the sound source (voiced or unvoiced), articulation as the sound passes through the vocal tract, and radiation of the sound from the lips or the nose, as shown in Figure 1.2:


Figure 1.2 The basic process of speech signal production

1.2.2 Structure of the auditory system
Unlike the organs involved in speech production, which also perform other bodily functions such as breathing, eating, and smelling, the ear is used only for hearing. The ear is especially sensitive to those frequencies in the speech signal that carry the information most relevant to communication (roughly 200 to 5600 Hz). Listeners can discriminate small differences in the timing and frequency of sounds within this range.
The ear has three parts: the outer ear, the middle ear, and the inner ear. The outer ear guides the pressure variations of speech to the eardrum, the middle ear converts this pressure into mechanical motion, and the inner ear converts the mechanical vibrations into electrical impulses in the auditory neurons leading to the brain.
The outer ear consists of the pinna (or auricle) and the auditory meatus, or external ear canal. The pinna contributes little or nothing to hearing sensitivity, but it protects the entrance to the ear canal and appears to help in discriminating sounds, particularly at higher frequencies. The pinna leads to the external ear canal, a short tube of varying shape about 25 to 35 mm long that provides the path for acoustic signals to the middle ear. The ear canal has two main functions. The first is to protect the complex and mechanically delicate structures of the middle ear. The second is to act as a tubular resonator that favors the transmission of high-frequency sounds between 2000 Hz and 4000 Hz. This function is important for speech perception and particularly helps with fricatives, whose distinguishing features are usually encoded in aperiodic energy in this region of the acoustic spectrum. Resonance in the ear canal also contributes to our overall hearing sensitivity between 500 Hz and 4000 Hz, a band that contains many of the main cues to phonological structure.

Figure 1.3 Structure of the peripheral auditory system

The middle ear consists of a cavity within the skull containing the eardrum (the membrane at the inner end of the external ear canal), a set of three interconnected bones called the hammer (malleus), the anvil (incus), and the stirrup (stapes), known collectively as the auditory ossicles, and the associated muscle structures. The purpose of the middle ear is to convert the sound pressure variations in the air that arrive through the outer ear into corresponding mechanical movements. This transmission begins at the eardrum, which is deflected by the air pressure variations reaching it through the ear canal. The movement is passed on to the ossicles, which act as an ingenious system of mechanical levers that carry it to the oval window at the interface with the inner ear and to the fluid of the cochlea beyond.
The lever action of the ossicles, together with the fact that the eardrum has a much larger surface area than the oval window, ensures efficient transmission of acoustic energy between 500 Hz and 4000 Hz and maximizes the sensitivity of the ear in this frequency range. The muscles attached to the ossicles also protect the ear against loud sounds through the acoustic reflex. This mechanism comes into play when sounds with an amplitude of about 90 dB or more reach the ear: the muscles contract and reposition the ossicles so as to reduce the efficiency of sound transmission to the oval window (Borden and Harris 1980, Moore 1989). The middle ear is connected to the throat by a narrow tube called the Eustachian tube. This forms an air path that opens when needed to equalize changes in background air pressure between the middle ear and the outer ear.
The inner ear is a complex structure encased in the skull. Its cochlea is responsible for converting mechanical movement into neural signals: the mechanical movement transmitted to the oval window by the ossicles is converted into neural signals, which are passed on to the central nervous system. In essence, the cochlea is a coiled structure ending in a window closed by a flexible membrane at each end. Inside, the cochlea is divided by two membranes, one of which, the basilar membrane, is extremely important for hearing. When movements (caused by sound vibrations) occur at the oval window, they are transmitted through the cochlear fluid and cause displacement of the basilar membrane. The basilar membrane is stiffer at one end than at the other, which means that the way it is displaced depends on the frequency of the incoming sound. High-frequency sounds cause the greatest displacement at the stiff end; as frequency decreases, the point of maximum displacement moves continuously toward the less stiff end. Attached along the basilar membrane is the organ of Corti, a complex structure containing many hair cells. It is the movement and stimulation of these hair cells that converts the displacement of the basilar membrane into neural signals. Because the basilar membrane is displaced at different places depending on frequency, the cochlea and its internal structures can convert both the frequency and the intensity of a sound into neural signals. It must be stressed, however, that the final neural representation of frequency information does not depend on the place of basilar membrane displacement alone, and our understanding of how frequency is encoded by the auditory system is still incomplete.

Figure 1.4 Cross-section of the cochlea

Early research on speech perception took very few of the basic perceptual properties of the ear into account. Moreover, it tried to tie the perceptual properties of the speech signal to a linear, time-varying spectral representation. By about 1980 many researchers had realized that the analytical effects of the human auditory system on speech signals need to be understood, and that it is a mistake to assume the listener simply processes information in the way an ordinary spectrograph does.

1.3. Basic characteristics of speech
1.3.1 Fundamental frequency and spectrum
Flow rate: the volume of air passing through the glottis per unit time (about 1 cm³/s).
Fundamental period $T_0$: when the vocal cords vibrate with period $T_0$, the flow rate also varies periodically with this period, and $T_0$ is called the fundamental period.

Figure 1.5 Fundamental frequency
The reciprocal of $T_0$, $F_0 = 1/T_0$, is called the fundamental frequency of speech. $F_0$ depends on the sex and age of the speaker; it varies with tone, and it also affects the intonation of an utterance.
1.3.2 Representing the speech signal
There are three basic ways to represent a speech signal:
- As a waveform over time.
- In the frequency domain: the spectrum of the speech signal.
- In three dimensions (sonagram).
a) Time-domain waveform
The portion of the signal corresponding to unvoiced sounds is aperiodic and random, with an amplitude or energy smaller than that of vowels (about 1/3).
Boundaries between words are intervals of silence. Silence must be clearly distinguished from unvoiced sounds.

Figure 1.6 Time-domain waveform

Waveform audio is commonly stored on computers in the *.WAV format, with typical sampling frequencies of 8000 Hz, 10000 Hz, 11025 Hz, 16000 Hz, 22050 Hz, 32000 Hz, 44100 Hz, and so on; a resolution, also called the number of bits per sample, of 8 or 16 bits; and 1 channel (mono) or 2 channels (stereo).
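The three parameters just listed (sampling frequency, bits per sample, number of channels) are exactly what a WAV header records. The following is a minimal sketch using Python's standard `wave` module; the helper names `write_tone` and `describe`, the 440 Hz test tone, and the 0.1 s duration are illustrative assumptions, not part of the course material.

```python
import io
import math
import struct
import wave

def write_tone(target, sample_rate=16000, bits=16, channels=1,
               freq=440.0, seconds=0.1):
    """Write a sine tone as uncompressed PCM WAV into a file or file-like object."""
    n_frames = round(sample_rate * seconds)
    frames = bytearray()
    for n in range(n_frames):
        value = math.sin(2 * math.pi * freq * n / sample_rate)
        if bits == 16:
            sample = struct.pack("<h", int(32767 * value))          # signed 16-bit
        else:
            sample = struct.pack("<B", int(127.5 + 127.5 * value))  # unsigned 8-bit
        frames += sample * channels          # same sample on every channel
    with wave.open(target, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(bits // 8)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))

def describe(source):
    """Return (sampling rate, bits per sample, channels, frame count) of a WAV."""
    with wave.open(source, "rb") as wav:
        return (wav.getframerate(), 8 * wav.getsampwidth(),
                wav.getnchannels(), wav.getnframes())

buffer = io.BytesIO()
write_tone(buffer)            # 16 kHz, 16-bit, mono
buffer.seek(0)
print(describe(buffer))
```

Passing a file name instead of the in-memory buffer writes a real .wav file that any audio player can open.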
The stored audio data differs depending on the recording device, the moment of utterance, and the speaker, as the following figures show clearly:

Figure 1.7 Sound recorded with two different microphones


Figure 1.8 Sound produced by two different speakers

Figure 1.9 Sound produced by one speaker at two different times

b) Spectrum of the speech signal
As noted above, audio signals occupy a frequency range of roughly 0 Hz to 20 kHz; most of the power of speech, however, lies between 0.3 kHz and 3.4 kHz. Some spectra of speech signals are shown below:

Figure 1.10 Speech signal spectrum and spectral envelope

Figure 1.11 Speech signal spectra computed with different numbers of samples

c) Three-dimensional representation of the speech signal (sonagram)
For a three-dimensional representation, the signal is divided into windowed frames corresponding to observation intervals, as shown in Figure 1.12.

Figure 1.12 Dividing the signal into windowed frames
Each window is 10 ms long.
With a sampling frequency $F_s$ = 16000 Hz, a window therefore contains 160 samples.
Adjacent windows overlap (by about 1/2 window).
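The framing arithmetic above can be sketched in a few lines of Python. The function name `split_into_frames` and its parameters are hypothetical; the defaults (10 ms windows, 16000 Hz, half-window overlap) follow the figures quoted in the text.

```python
def split_into_frames(signal, sample_rate=16000, frame_ms=10, overlap=0.5):
    """Cut a sequence of samples into overlapping fixed-length frames."""
    frame_len = int(sample_rate * frame_ms / 1000)    # 160 samples at 16 kHz
    hop = max(1, int(frame_len * (1 - overlap)))      # 80 samples for 1/2 overlap
    last_start = len(signal) - frame_len
    # Trailing samples that do not fill a whole frame are dropped.
    return [signal[start:start + frame_len]
            for start in range(0, last_start + 1, hop)]

one_second = list(range(16000))       # stand-in for one second of samples
frames = split_into_frames(one_second)
print(len(frames), len(frames[0]), frames[1][0])
```

With a half-window overlap, each sample (apart from those at the edges) falls in two consecutive frames.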
Next, the spectrum of each frame is drawn along the vertical axis, with spectral magnitude shown as darker or lighter shading. Moving on to the next window advances the plot along the time axis.
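As a rough illustration of this construction, the sketch below computes one magnitude spectrum per frame with a direct DFT, giving the columns of a spectrogram. The Hamming window, the plain $O(N^2)$ DFT (chosen for clarity over an FFT), and the names `magnitude_spectrum` and `spectrogram` are assumptions made for this example.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitude of the DFT of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * b * k / n)
                    for k, x in enumerate(windowed)))
            for b in range(n // 2 + 1)]

def spectrogram(signal, frame_len=160, hop=80):
    """One spectrum per frame; each inner list is one column of the picture."""
    return [magnitude_spectrum(signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]

# A 1 kHz tone sampled at 16 kHz: with 160-sample frames each bin is 100 Hz wide.
fs = 16000
tone = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(480)]
columns = spectrogram(tone)
peak_bins = [max(range(len(col)), key=col.__getitem__) for col in columns]
print(peak_bins)
```

Mapping each magnitude to a gray level and placing the columns side by side yields the time-frequency picture described in the text.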

Figure 1.13 Spectrum of one windowed frame

Figure 1.14 Consecutive windowed frames and the corresponding spectrogram

The three-dimensional representation is a very powerful tool for observing and analyzing the signal. For example, it makes unvoiced and voiced sounds easy to tell apart by the following features:
+ Unvoiced sounds:
- Energy is concentrated at high frequencies.
- Frequencies are spread fairly evenly across both the high and the low frequency regions.
+ Voiced sounds:
- Energy is unevenly distributed.
- There are distinct bands at the spectral peaks.

Figure 1.15 Voiced sound

Figure 1.16 Unvoiced sound
d) Formants and antiformants
The vocal tract acts as a resonant cavity that reinforces certain frequencies. The reinforced frequencies are called formants. If the oral cavity is taken to be the vocal tract, the nasal cavity is likewise a resonant cavity. Because the nasal and oral cavities are coupled in parallel, certain frequencies are attenuated; these attenuated frequencies are called antiformants.

Figure 1.17 Spectral envelope and formants
Figure 1.17 shows that formants can be identified up to the fifth (F5), but the most important ones are F1 and F2. Formants can differ even for the same speaker. Characterizing a voiced sound by the absolute formant values alone is therefore not accurate; the relative distribution of the formants must be used. Moreover, reading formants directly off the spectrum is not accurate either; they must be taken from the spectral envelope, which is also the frequency response of the vocal tract.













Chapter 2: Digital representation of speech signals
2.1. Introduction
Coding is the process of converting discrete values into corresponding codes. In general, sampling converts continuous signals into signals that are discrete in the time domain, known as PAM (pulse amplitude modulation). Coding then quantizes these sample values into discrete values in the amplitude domain and converts them into binary codes or multiplexed codes. When coded information is transmitted, several pulses are needed for each sample value, so the bandwidth required for transmission has to be widened. At the same time, crosstalk, thermal noise, sample distortion, loss of sample pulses, companding distortion, coding noise, and smoothing noise arise during sampling and coding. Decoding restores the coded signals to quantized PAM signals, carrying out the coding steps in exactly the reverse order. Put another way, quantizing, compressing, and coding the PAM signals is called the coding process, while D/A conversion of the PCM signals, followed by expansion and filtering to recover the original speech, is called the decoding process. The basic configuration of a PCM transmission system, which converts analog signals into pulse-code signals for transmission, is shown in figure (pcm1). The input signals are first sampled sequentially and then quantized into discrete values on the amplitude axis. The quantized values are represented by binary codes, which are in turn converted into code formats suited to the characteristics of the transmission line.
A coding terminal converts information signals such as speech into digital signals such as PCM. When the information signals are analog, A/D conversion is performed; when they are already digital, D/D conversion is performed. Sometimes compression and wideband coding are carried out by removing redundancy during the A/D or D/D conversion.
The rules for 32 kbps adaptive differential PCM (ADPCM), in which speech signals are companded by predictive coding, are specified in ITU Recommendation G.721. The 32 kbps ADPCM method, adopted in October 1984, converts existing 64 kbps A-law or μ-law PCM signals into ADPCM signals. 32 kbps ADPCM can carry twice as much speech as conventional 64 kbps PCM, or even more, and is widely used in transcoders and high-efficiency coding terminals. The advanced countries of the world are now intensively researching bit-rate coding technology not only for speech but also for television. The following sections discuss this in more detail.

Figure 2.1 Basic configuration of an information transmission and processing system
2.2. Sampling the speech signal
The basic principle of pulse code modulation is to convert continuous signals such as speech into discrete digital signals and later reconstruct the original information from them. To do this, elements of information are extracted from the analog signal one after another. This process is called sampling.
- Speech signal m(t).
- Sampling pulse s(t).
- Sampling function.
- Sampled PAM signal.
According to Shannon's sampling theorem, the original signal can be recovered if it is sampled at a rate greater than or equal to twice the highest frequency it contains. The sampling pulse train is a periodic waveform, a sum of harmonics whose amplitudes follow a sin(x)/x-shaped envelope over frequency. The spectrum of the speech signal after sampling is therefore as shown in Figure 2.3.
There are two kinds of sampling, depending on the shape of the pulse top: natural sampling and flat-top sampling. Natural sampling is the ideal case, in which the spectrum after sampling coincides with the spectrum of the original signal. In practical systems, however, this cannot be achieved. Flat-top sampling introduces a distortion known as the aperture effect. In addition, if the input signal contains components beyond the 4 kHz bandwidth, fold-over distortion (aliasing) appears. The input signal must therefore be band-limited by filtering before it is sampled.
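The fold-over effect can be checked numerically. In the sketch below, which assumes an 8 kHz sampling rate (consistent with the 4 kHz band limit above), a 5 kHz tone produces exactly the same samples as a 3 kHz tone, so the two cannot be told apart after sampling. The function names are illustrative.

```python
import math

def sample_tone(freq_hz, sample_rate_hz, count):
    """Samples of a unit-amplitude cosine at the given frequency."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(count)]

def alias_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency after sampling, folded into [0, sample_rate / 2]."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

fs = 8000
above_band = sample_tone(5000, fs, 64)   # 5 kHz exceeds fs / 2 = 4 kHz
in_band = sample_tone(3000, fs, 64)
largest_gap = max(abs(a - b) for a, b in zip(above_band, in_band))
print(alias_frequency(5000, fs), largest_gap)
```

This is why the band-limiting filter has to come before the sampler: once the samples are taken, the out-of-band component is indistinguishable from a legitimate in-band one.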

Figure 2.2 The sampling process


Figure 2.3 Signal spectrum before and after sampling
2.3. Quantization
The PAM signal, whose amplitude is still analog, becomes a digital signal, discrete in amplitude, after passing through quantization. When the continuous amplitude of speech is represented with a limited number of levels, it is described by an approximating staircase waveform. The quantization noise $N_Q = Q - S$ is the difference between the quantized waveform ($Q$) and the original waveform ($S$). With a smaller step the quantization noise decreases, but more uniform steps are needed to cover the whole input range, so the number of code bits increases.
Noise produced when the input amplitude exceeds the quantizer range is called overload noise or saturation noise. The ratio $S/N_Q$ is used as a measure for judging the strengths and weaknesses of a PCM method. Each additional bit per sample in the code word improves $S/N_Q$ by 6 dB.
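The 6 dB per bit rule can be checked with a short experiment. The sketch below quantizes a full-scale sine wave with a uniform quantizer and measures $S/N_Q$; the mid-rise quantizer design, the test frequency, and the function names are assumptions chosen for the example.

```python
import math

def quantize_uniform(x, bits):
    """Uniform mid-rise quantizer for inputs in the range [-1, 1)."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(x / step)
    index = max(-(levels // 2), min(levels // 2 - 1, index))  # clip on overload
    return (index + 0.5) * step

def sqnr_db(bits, count=20000):
    """Measured signal-to-quantization-noise ratio for a full-scale sine."""
    signal = [math.sin(2 * math.pi * 0.01234567 * n) for n in range(count)]
    noise = [quantize_uniform(x, bits) - x for x in signal]
    signal_power = sum(x * x for x in signal) / count
    noise_power = sum(e * e for e in noise) / count
    return 10 * math.log10(signal_power / noise_power)

for bits in (6, 7, 8, 9):
    print(bits, round(sqnr_db(bits), 2))
```

Successive lines of the output should differ by roughly 6 dB, in line with the rule of thumb above.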

Figure 2.4 Quantization noise as a function of input signal amplitude

As coding and decoding methods, linear coding, nonlinear coding, and adaptive coding can be chosen according to the kind of information source. Linear coding keeps the amount of quantization noise added to the transmitted information the same regardless of the input level. It is used in systems where the absolute amount of noise matters more than $S/N_Q$. Nonlinear coding is widely used in systems, such as speech systems, where the S/N at the receiver matters more than the absolute amount of noise. With a constant quantization step, S/N varies with the signal level, and quality deteriorates at low signal levels. The step is therefore made smaller for low-level signals and larger for high-level signals, so that S/N stays roughly balanced across input levels. This is achieved by companding the amplitude. Ideally, the compression and expansion curves are linear for low-level signals and logarithmic for high-level signals. ITU-T currently recommends μ-law (μ = 255), a 15-segment method (used in the systems of the United States and Japan), and A-law (A = 87.6), a 13-segment method (used in European systems and in Vietnam), as segmented companding methods in which the logarithmic functions are approximated by a few straight-line segments.
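The shape of the μ-law curve can be illustrated with the standard continuous characteristic $y = \operatorname{sgn}(x)\,\ln(1+\mu|x|)/\ln(1+\mu)$ for $|x| \le 1$. The sketch below implements this smooth curve and its inverse, not the 15-segment piecewise-linear approximation used in practical codecs; the function names are illustrative.

```python
import math

MU = 255.0

def mu_compress(x, mu=MU):
    """Continuous mu-law compression of a sample in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

for x in (0.001, 0.01, 0.1, 0.5, 1.0):
    y = mu_compress(x)
    print(x, round(y, 4), round(mu_expand(y), 6))
```

Small amplitudes are stretched over a large share of the output range, which is what gives low-level speech a finer effective quantization step once the compressed value is quantized uniformly.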

Figure 2.5 Linear and nonlinear quantization

Figure 2.6 $S/N_Q$ characteristics of the quantization methods

Coding and compression are carried out together, through digital companding or direct coding, without separate additional circuits, by exploiting the piecewise linearity of segmented companding. Table 2.1 lists the coding and companding values for μ = 255.

Table 2.1 Coding and decoding table for μ = 255
2.4. Coding and decoding
Coding is the process of matching the discrete values obtained from quantization to code pulses. The binary codes commonly used for coding are natural binary codes, Gray codes (reflected binary codes), and folded binary codes. Most coders compare the input signal with a reference voltage to decide whether a signal is present. A D/A converter, or decoder, is therefore needed to generate the reference voltage. In public PCM communication, speech is represented with 8 bits. In the μ-law case, the 8-bit PCM word is built as follows.
Polarity bit = {0, 1}.
Segment bits = {000, 001, …, 111}.
Step bits = {0000, 0001, …, 1111}.
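The three fields listed above can be packed into one byte as sketched below. Placing the polarity bit in the most significant position, followed by the segment and step fields, is an assumption made for this illustration; the function names are hypothetical, and no further bit manipulation for line transmission is modeled.

```python
def pack_pcm_word(polarity, segment, step):
    """Combine polarity (1 bit), segment (3 bits), and step (4 bits) into 8 bits."""
    if polarity not in (0, 1) or not 0 <= segment <= 7 or not 0 <= step <= 15:
        raise ValueError("field out of range")
    return (polarity << 7) | (segment << 4) | step

def unpack_pcm_word(word):
    """Split an 8-bit PCM word back into (polarity, segment, step)."""
    return (word >> 7) & 0b1, (word >> 4) & 0b111, word & 0b1111

word = pack_pcm_word(polarity=1, segment=0b101, step=0b0011)
print(format(word, "08b"), unpack_pcm_word(word))
```

One polarity bit, 8 segments, and 16 steps per segment give the 2 × 8 × 16 = 256 code words that fit in 8 bits.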
Because the first segments of the "+" and "−" sides together form a single straight line, there are 15 segments. The "+" polarity of the signal waveform corresponds to polarity bit 0 and the "−" polarity to 1. The word is transmitted after its bits are inverted, "0" to "1" and "1" to "0"; as a result, many 1s cluster around the zero level, which makes timing recovery at the receiver easy. B8, the eighth bit of the PCM word, is sometimes used as a signaling bit. B7 (or B8) is changed to "1" when every bit of the PCM word is "0". In the transmitted PCM signal, the number of consecutive "0"s is therefore always less than 16. In the North American method, bit B2 of each channel is changed to "0" to convey alarm information to the far end. In Japan, the "S" bit, which is part of the frame alignment bits, is used for this purpose instead. The received PCM words are converted into PAM signals by the decoder. At the receiver, the pulses belonging to each channel are selected from the multiplexed pulse train to produce the PAM signals. The speech signals are then recovered by a low-pass filter.

Figure 2.7 Coding a PCM word

Figure 2.8 The decoding process

Figure 2.9 The decoding process and spectra
2.5. Differential pulse code modulation (DPCM)
This method exploits the correlation of the speech signal and transmits only the difference between adjacent samples:

Figure 2.10 DPCM encoder and decoder block diagram
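The analog speech signal first passes through a low-pass filter that limits its bandwidth (usually to half the sampling frequency). The transmitter then quantizes and codes the difference between the analog sample $x_n$ and the predicted signal $x'_n$ taken from the output of the predictor. The predicted value of the next sample is obtained by extrapolating from the $p$ preceding sample values: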

$$x'(n) = \sum_{i=1}^{p} a_i \, x'_{n-i} \qquad (2.1)$$
Here $a_i$ are the predictor coefficients. The difference between the input sample and the predictor output is:
$$e_n = x_n - x'(n) \qquad (2.2)$$
y chnh l gi tr dng lng t ho v truyn i, pha thu s tin hnh hi phc li
tn hiu sai s ny v tch phn li cng vi tn hiu hi phc trc , tuy nhin gim
li cng li ca nhiu ln ta dng phia thu mt b d on ging vi pha pht. Vic s dng
vng phn hi gip cho b lng t hn ch chnh lch gia sai s e
n
v s
i
s c lng
t e

n
(e

n
-e
n
). Nu gi tr ny cng nh th cht lng ting ni cng tt, theo cc tnh ton th
phng php ny c rng bng tn i mt na.
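The DPCM loop described by (2.1) and (2.2) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the uniform quantizer with step `step`, the zero initial predictor state and the function names `dpcm_encode` and `dpcm_decode` are not taken from the text.

```python
import math

def dpcm_encode(x, a, step):
    """Encode samples x with predictor coefficients a (a[0] is a_1).

    The predictor runs on reconstructed samples, exactly as the decoder will,
    so quantization errors do not accumulate.
    """
    p = len(a)
    recon = [0.0] * p                      # assumed zero initial state
    codes = []
    for sample in x:
        pred = sum(a[i] * recon[-1 - i] for i in range(p))   # equation (2.1)
        e = sample - pred                                     # equation (2.2)
        q = round(e / step)                # uniform quantizer index (assumption)
        codes.append(q)
        recon.append(pred + q * step)      # same reconstruction as the decoder
    return codes

def dpcm_decode(codes, a, step):
    """Rebuild the signal from quantizer indices with the same predictor."""
    p = len(a)
    recon = [0.0] * p
    for q in codes:
        pred = sum(a[i] * recon[-1 - i] for i in range(p))
        recon.append(pred + q * step)
    return recon[p:]

if __name__ == "__main__":
    x = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]
    y = dpcm_decode(dpcm_encode(x, [0.9], 0.05), [0.9], 0.05)
    print(max(abs(u - v) for u, v in zip(x, y)))   # bounded by step / 2
```

Because encoder and decoder share the same predictor state, the reconstruction error per sample stays within half a quantizer step.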
2.6. Delta modulation (DM)
Delta modulation is a form of DPCM in which each code word consists of a single binary bit; its advantage is that the circuit is easy to build (see the figure below). The speech signal, band-limited to 0.3-3.4 kHz, is sampled to form the PAM signal $x_n$. This signal is compared with the predicted signal $x'_n$, and the difference between the two values, $e_n$, is quantized to one of two values, $-\Delta$ or $+\Delta$. The quantizer output therefore sends one binary bit for each sample. At the receiver, the values $\pm\Delta$ are added to the current predicted values so that the decoder output reconstructs the original speech. The bit rate of delta modulation equals the sampling rate, that is, 8 kbps for sampling at 8 kHz. As noted, the method is quite simple and achieves a very low coding rate; it is the only waveform coding method whose bit rate is comparable with that of source-parameter coding methods. However, the quality of the coded signal is not high, and the dynamic range of a PCM system cannot be guaranteed.
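The one-bit quantization loop of delta modulation can be sketched as follows. The names `dm_encode` and `dm_decode`, the zero starting value of the staircase and the tie rule (a zero difference is coded as 1) are assumptions made for the illustration.

```python
import math

def dm_encode(x, delta):
    """One bit per sample: 1 means step up by delta, 0 means step down."""
    bits, approx = [], 0.0
    for sample in x:
        bit = 1 if sample >= approx else 0     # sign of e_n = x_n - x'_n
        approx += delta if bit else -delta     # staircase approximation
        bits.append(bit)
    return bits

def dm_decode(bits, delta):
    """Accumulate +delta / -delta steps to rebuild the staircase."""
    out, approx = [], 0.0
    for bit in bits:
        approx += delta if bit else -delta
        out.append(approx)
    return out

if __name__ == "__main__":
    # Slowly varying input: the slope per sample stays below delta,
    # so there is no slope overload and the staircase tracks the input.
    x = [math.sin(2 * math.pi * n / 400) for n in range(800)]
    y = dm_decode(dm_encode(x, 0.05), 0.05)
    print(max(abs(u - v) for u, v in zip(x, y)))
```

When the input slope per sample is smaller than `delta`, the tracking error stays within one step; a steeper input would produce slope-overload distortion.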
2.7. Adaptive delta modulation (ADM)
This method is also called continuously variable slope delta modulation. It overcomes the limited dynamic range of delta modulation by dynamically changing the gain of the integrator to match the average power level of the input signal.

Figure 2.11 Delta modulation encoding and decoding block diagram

Figure 2.12 Signal waveforms in delta modulation

The quantization step size is varied by changing the gain of the integrator, using an RC circuit and a squaring circuit. When the input signal is constant or changes slowly with time, the modulator produces a pulse train of alternating polarity; the RC circuit averages this train and outputs a value close to zero. The control signal therefore changes the amplifier gain very little, and the amplifier output has a small step size. When the input signal has a steep slope, the staircase function must keep up with the slope of the input. The modulator then produces a run of pulses of the same polarity, which the RC circuit averages into a large control voltage, so the step size increases. Thanks to the squaring circuit, the control voltage of the amplifier is always positive, regardless of the polarity of the pulses. This method reduces both slope-overload distortion and granular noise.
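The step-size adaptation can be imitated digitally. The sketch below is not the RC and squaring circuit of the text: as a stand-in assumption it multiplies the step by `grow` when two consecutive bits are equal (steep slope) and by `shrink` when they alternate (flat input). All names and parameter values are hypothetical.

```python
def _next_step(step, bit, prev, step_min, step_max, grow, shrink):
    """Step-size rule shared by encoder and decoder (illustrative assumption)."""
    if prev is not None:
        step = step * grow if bit == prev else step * shrink
    return min(max(step, step_min), step_max)

def adm_encode(x, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.5):
    """Return the bit stream and the step sizes that were used."""
    bits, steps = [], []
    approx, step, prev = 0.0, step_min, None
    for sample in x:
        bit = 1 if sample >= approx else 0
        step = _next_step(step, bit, prev, step_min, step_max, grow, shrink)
        approx += step if bit else -step
        bits.append(bit)
        steps.append(step)
        prev = bit
    return bits, steps

def adm_decode(bits, step_min=0.01, step_max=1.0, grow=1.5, shrink=0.5):
    """Repeat the same adaptation from the bits alone."""
    out, approx, step, prev = [], 0.0, step_min, None
    for bit in bits:
        step = _next_step(step, bit, prev, step_min, step_max, grow, shrink)
        approx += step if bit else -step
        out.append(approx)
        prev = bit
    return out

if __name__ == "__main__":
    ramp = [0.2 * n for n in range(100)]       # steep input slope
    bits, steps = adm_encode(ramp)
    print(max(steps), adm_decode(bits)[-1], ramp[-1])
```

The decoder needs no side information, because the step size is derived from the transmitted bits themselves.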

Figure 2.13 Signal waveforms in ADM


Figure 2.14 ADM encoding and decoding block diagram
2.8. Adaptive differential pulse code modulation (ADPCM)
This is an important coding method that combines the advantages of the methods above. It was standardized by the ITU-T in Recommendation G.721 and has many practical applications, for example in the CT2 and DECT cordless telephone systems. We therefore study it in more detail. The standardized bit rates are 40, 32, 24 and 16 kbps. The method relies on the slowly varying nature of the variance and the autocorrelation function of speech. PCM uses a uniform quantizer whose noise power is $\Delta^2/12$; in ADPCM, and in linear prediction methods in general, $\Delta$ varies, which is why they are also called adaptive quantization methods. The algorithms developed for differential pulse code modulation encode the speech signal using an adaptive quantizer and an adaptive predictor whose parameters are updated periodically to reflect the statistics of the speech signal.

Figure 2.15 ADPCM encoder block diagram

Figure 2.16 ADPCM decoder block diagram
2.9. Practical exercise: digital representations of the speech signal
Using a personal computer and Matlab (or another programming language), carry out the following tasks:
Record an arbitrary speech segment and save it as an uncompressed file (*.wav).
Use Matlab or another programming language to read the signal and display its time-domain waveform.
Display the spectrum of one segment of the signal using different window functions.
Apply one of the coding methods studied in this chapter to the segment.
Evaluate the result against the following criteria: file size, perceived audio quality, and so on.
Chapter 3: Speech analysis
3.1. Introduction
In this chapter we examine methods for analyzing the speech signal. Speech analysis addresses the problem of finding an optimal form in which speech can be represented efficiently. It is the basis for developing techniques and technologies for speech synthesis, speech recognition and speech enhancement. Speech analysis usually extracts features from the speech signal, or converts it into another representation that expresses the speech information better for the purpose at hand. In general, most speech analysis methods focus on one of three main issues. The first is to remove the influence of phase, a component that does not play an important role in conveying speech information. The second is to separate the sound source from the filter (the vocal tract model) so that the spectral magnitude of the signal can be studied independently. The last is to convert the signal, or its spectral magnitude, into another, more efficient representation.
3.2. Speech analysis model
The general model for speech analysis is shown in Figure 3.1. The signal forms at each step are also shown in the illustration.
The speech signal is preprocessed by passing it through a low-pass filter with a cut-off frequency of about 8 kHz. The resulting signal is then converted into digital form by an ADC. Typically, the sampling frequency is 16 kHz and the quantization resolution is 16 bits per sample.
The digital speech signal is divided into frames, with a frame length of typically about 30 ms and a frame shift of typically 10 ms. Each analysis frame is then tapered by applying a common window function such as the Hamming or Hanning (Hann) window. The windowed signal is passed to a spectral analysis method (for example STFT or LPC). After this basic spectral analysis, the result may be passed on to further blocks for feature extraction.
3.3. Short-time speech analysis
In analysis theory we often overlook an important point: the analysis must be carried out over a limited time interval. For example, we know that the continuous-time Fourier transform is an extremely useful tool for signal analysis. However, it requires the signal to be known for all time. Moreover, the properties or features of the signal that we want to study must be quantities that do not change with time. In practice this is hard to achieve, because any analysis that serves a real application has finite duration. Most signals, and speech signals in particular, are not time-invariant.

Figure 3.1 General model of speech signal processing

In principle, we can apply the known analysis techniques to the signal over a short time. However, because the speech signal is a process that carries dynamic information, we cannot simply consider short-time analysis within a single isolated frame.
As mentioned, the speech signal varies with time. Its basic characteristics include the excitation source, pitch, amplitude, and so on. The time-varying parameters of the speech signal include the fundamental frequency (pitch), the type of sound (voiced, unvoiced, fricative or silence), the main resonance frequencies (formants), the vocal tract area function, and others.
Short-time analysis means examining the signal over a small time interval around the instant $n$ under consideration. These intervals are usually about 10-30 ms long. This allows us to assume that within the interval the properties of the speech waveform are relatively stable. The short piece of signal used for analysis is usually called a frame or a segment. A frame is defined as the product of a shifted window function $w(n-m)$ and the signal sequence $s(m)$:
$$s_n(m) = s(m)\, w(n-m) \qquad (3.1)$$
A frame can be understood as a piece of the signal cut out by a window function to form a new sequence whose values are zero outside the interval $m \in [n-N+1, n]$, where $N$ is the window length. From (3.1) we see that the frame depends on its end time $n$. Within a frame defined in this way, short-time processing operations have the same meaning as the corresponding long-time operations.
As mentioned, speech analysis cannot be done on a single isolated frame; it must be based on the analysis of successive frames. In practice, to avoid losing information, the frames are usually taken so that they overlap. In other words, two adjacent frames share at least $M > 0$ samples. Figure 3.2 illustrates the division into frames with a window function.

Figure 3.2 Signal analysis on overlapping frames

A general short-time analysis can be written as:
$$X_n = \sum_{m=-\infty}^{\infty} T\{s(m)\, w(n-m)\} \qquad (3.2)$$
where $X_n$ denotes the analysis parameter (or vector of analysis parameters) at analysis time $n$. The operator $T\{\cdot\}$ defines the short-time analysis function. The infinite limits of the sum in (3.2) mean that the sum is taken over all non-zero components of the windowed frame. In other words, the sum runs over all values of $m$ in the support of the window function.
Commonly used window functions are the rectangular window, the Hanning window and the Hamming window.
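The framing operation of (3.1) with overlapping frames can be sketched as follows. The function names, the 8 kHz sampling rate and the 30 ms / 10 ms frame length and shift used in the demonstration are illustrative assumptions; the Hamming coefficients 0.54 and 0.46 are the usual definition of that window.

```python
import math

def hamming(length):
    """Hamming window of the given length in samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]

def frame_at(s, n, window):
    """Frame s_n(m) = s(m) w(n - m) for m = n-N+1 .. n, as in (3.1)."""
    N = len(window)
    return [s[m] * window[n - m] for m in range(n - N + 1, n + 1)]

def split_into_frames(s, window, hop):
    """All complete frames with end times n = N-1, N-1+hop, N-1+2*hop, ..."""
    N = len(window)
    return [frame_at(s, n, window) for n in range(N - 1, len(s), hop)]

if __name__ == "__main__":
    fs = 8000                                  # assumed sampling rate
    s = [1.0] * 1000
    w = hamming(int(0.030 * fs))               # 30 ms window = 240 samples
    frames = split_into_frames(s, w, int(0.010 * fs))   # 10 ms shift
    print(len(frames), len(frames[0]))
```

With a 240-sample window and an 80-sample shift, adjacent frames share 160 samples, which is the overlap $M$ mentioned above.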
3.4. Speech analysis in the time domain
Analyzing speech in the time domain means analyzing the signal waveform directly, after windowing in the time domain. As mentioned in the previous section, we consider only short-time analysis of the signal. For simplicity of presentation, the formulas below are therefore understood to be short-time analyses. Where an analysis is not short-time, this will be stated explicitly.
a) Average energy
The first parameter of interest in time-domain speech analysis is the average energy. The average energy of the speech signal is defined as:
$$E_n = \sum_{m=-\infty}^{\infty} s_n^2(m) = \sum_{m=-\infty}^{\infty} \big(s(m)\, w(n-m)\big)^2 \qquad (3.3)$$
The average energy is very useful for estimating the properties of the excitation function in models of the speech production system and in speech synthesis models. It also provides a useful tool for deciding whether a signal segment is voiced, unvoiced or silence, because the amplitude of unvoiced speech is usually much smaller than that of voiced speech.
Note that the length of the analysis window must be chosen appropriately. It must be long enough for the energy fluctuations within a frame to be smoothed. It must not be so long, however, that the way the energy changes from one segment of the signal to the next is obscured.
A drawback of the average energy is that large signal levels tend to bias the energy estimate of the whole frame significantly, because the samples are squared.
b) Average magnitude
As mentioned above, the average energy is quite sensitive to large signal levels. For this reason an alternative quantity, the average magnitude, is often used. It is defined by:
$$M_n = \sum_{m=-\infty}^{\infty} |s(m)|\, w(n-m) \qquad (3.4)$$
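The two measures (3.3) and (3.4) can be compared on a small frame that contains one large sample. The function names and the rectangular window used in the demonstration are illustrative assumptions.

```python
def short_time_energy(s, n, window):
    """E_n of (3.3): sum of (s(m) w(n-m))^2 over the window support."""
    N = len(window)
    return sum((s[m] * window[n - m]) ** 2 for m in range(n - N + 1, n + 1))

def short_time_magnitude(s, n, window):
    """M_n of (3.4): sum of |s(m)| w(n-m) over the window support."""
    N = len(window)
    return sum(abs(s[m]) * window[n - m] for m in range(n - N + 1, n + 1))

if __name__ == "__main__":
    s = [1.0] * 9 + [10.0]          # nine small samples and one large sample
    w = [1.0] * 10                  # rectangular window
    E = short_time_energy(s, 9, w)
    M = short_time_magnitude(s, 9, w)
    # Share of the single large sample in each measure
    print(100.0 / E, 10.0 / M)
```

In this example the single large sample accounts for most of the energy but only about half of the average magnitude, which illustrates why the magnitude is less sensitive to large levels.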
c) Zero-crossing rate
Another parameter often used in time-domain speech analysis is the zero-crossing rate. A zero crossing occurs when the signal waveform crosses the horizontal axis or, in other words, when consecutive samples have different signs. Mathematically, the zero-crossing rate is defined as:
$$Z_n = \sum_{m=-\infty}^{\infty} 0.5\, \big|\,\mathrm{sgn}\{s(m)\} - \mathrm{sgn}\{s(m-1)\}\,\big|\; w(n-m) \qquad (3.5)$$
where $\mathrm{sgn}(a)$ is the sign function: it equals 1 if $a \ge 0$ and $-1$ if $a < 0$. Clearly $0.5\,|\mathrm{sgn}\{s(m)\} - \mathrm{sgn}\{s(m-1)\}|$ equals 1 if $s(m)$ and $s(m-1)$ have different signs and 0 if they have the same sign. This means that $Z_n$ is the weighted sum of all sign changes of the samples within the support of the shifted window $w(n-m)$. The zero-crossing rate can be regarded as a measure of frequency. Although it varies considerably with time and with the type of signal, it differs clearly between unvoiced and voiced speech. Voiced speech has a strong roll-off at high frequencies because of the natural low-pass character of the glottal pulses, whereas unvoiced speech has high energy at high frequencies. Therefore, like the average energy, the zero-crossing rate is an important parameter for deciding whether a signal is unvoiced speech, voiced speech or silence.
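Equation (3.5) translates almost directly into code. In the sketch below the window is a rectangular window with weights $1/N$ (an assumption), so that the result is the number of zero crossings per sample; the function names and the test tone are also illustrative.

```python
import math

def sgn(a):
    """Sign function of the text: 1 for a >= 0, -1 for a < 0."""
    return 1 if a >= 0 else -1

def zero_crossing_rate(s, n, window):
    """Z_n of (3.5) for the frame ending at time n."""
    N = len(window)
    total = 0.0
    for m in range(max(1, n - N + 1), n + 1):
        total += 0.5 * abs(sgn(s[m]) - sgn(s[m - 1])) * window[n - m]
    return total

if __name__ == "__main__":
    fs, f0 = 8000, 100                         # assumed test tone
    s = [math.sin(2 * math.pi * f0 * i / fs + 0.3) for i in range(fs)]
    w = [1.0 / fs] * fs                        # rectangular window, weights 1/N
    print(zero_crossing_rate(s, fs - 1, w), 2 * f0 / fs)
```

For a pure tone the measured rate is close to two crossings per period, which is why the zero-crossing rate can serve as a rough frequency measure.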
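d) Autocorrelation function
The autocorrelation function is often used as a tool for detecting the periodicity of a signal, and it is also the basis of many spectral analysis methods. The short-time autocorrelation function is defined analogously to the ordinary autocorrelation function:
$$\begin{aligned}
\phi_n(k) &= \sum_{m=-\infty}^{\infty} s_n(m)\, s_n(m+k) \\
&= \sum_{m=-\infty}^{\infty} s(m)\, w(n-m)\, s(m+k)\, w(n-k-m) \\
&= \sum_{m=-\infty}^{\infty} s(m)\, s(m-k)\, \tilde{w}_k(n-m)
\end{aligned} \qquad (3.6)$$
In the last line of (3.6) we used the fact that the autocorrelation function is an even (symmetric) function, $\phi_n(-k) = \phi_n(k)$, together with $\tilde{w}_k(m) = w(m)\, w(m+k)$.
As for the ordinary autocorrelation function, there is a relation between the short-time autocorrelation function and the average energy of the signal:
$$E_n = \sum_{m=-\infty}^{\infty} \big(s(m)\, w(n-m)\big)^2 = \phi_n(0) \qquad (3.7)$$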
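A direct evaluation of the first line of (3.6) on an already windowed frame makes the symmetry and the energy relation (3.7) easy to check. The function name and the tiny example frame are illustrative assumptions.

```python
def short_time_autocorrelation(frame, k):
    """phi_n(k) = sum_m s_n(m) s_n(m+k) for a windowed frame s_n(m)."""
    k = abs(k)                                  # the function is even in k
    return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))

if __name__ == "__main__":
    frame = [1.0, 2.0, 3.0, 4.0]
    energy = sum(v * v for v in frame)
    print(short_time_autocorrelation(frame, 0), energy)       # equal, see (3.7)
    print(short_time_autocorrelation(frame, 1),
          short_time_autocorrelation(frame, -1))              # even symmetry
```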

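e) Average magnitude difference function
The average magnitude difference function (AMDF) is defined as:
$$\Delta M_n(\tau) = \sum_{m=-\infty}^{\infty} |s(m) - s(m-\tau)|\; w(n-m) \qquad (3.8)$$
Equation (3.8) shows that the value of the average magnitude difference function, with the time lag $\tau$ as its parameter, becomes very small when $\tau$ approaches the period (if any) of the signal $s(n)$. The average magnitude difference function is therefore a useful tool for determining the fundamental frequency of the speech signal.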
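The behaviour of (3.8) at the signal period can be checked numerically. The sketch below uses a rectangular window and divides by the number of terms (a normalization that is an assumption, not part of (3.8)); the function name and the test signal with a period of 40 samples are also illustrative.

```python
import math

def amdf(frame, lag):
    """Average magnitude difference of a frame at the given lag (in samples)."""
    n_terms = len(frame) - lag
    return sum(abs(frame[m] - frame[m - lag])
               for m in range(lag, len(frame))) / n_terms

if __name__ == "__main__":
    period = 40
    frame = [math.sin(2 * math.pi * i / period)
             + 0.5 * math.sin(4 * math.pi * i / period) for i in range(200)]
    best = min(range(20, 61), key=lambda lag: amdf(frame, lag))
    print(best, amdf(frame, best))      # the minimum falls at the period
```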
3.5. Speech analysis in the frequency domain
3.5.1 Spectral structure of the speech signal
In speech analysis, instead of using the time-domain signal directly, one often uses the spectral features of speech. This stems from the view that the speech signal, like other deterministic signals, can be regarded as a sum of sinusoids with slowly varying amplitude and phase. An equally important reason is that human speech perception is related much more directly to the spectral information of the speech signal, whereas the phase information does not play a decisive role.
The complex short-time spectrum of the speech signal is defined as the Fourier transform (FT) of the signal frame for a fixed analysis time $n$:
$$S_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} s(m)\, w(n-m)\, e^{-j\omega m} \qquad (3.9)$$
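Expression (3.9) can be rewritten as:
$$S_n(e^{j\omega}) = \left[ s(\tilde{n})\, e^{-j\omega \tilde{n}} \right] * w(\tilde{n}) \,\Big|_{\tilde{n}=n} \qquad (3.10)$$
Expression (3.10) is known as the filtering interpretation of the discrete-time Fourier transform. The modulated signal $s(\tilde{n})e^{-j\omega \tilde{n}}$ shifts the spectrum of $s(\tilde{n})$ down by $\omega$, and the result is selected by the window acting as a filter whose passband is centred at zero frequency (that is, a low-pass filter).
Alternatively, (3.9) can be written as:
$$S_n(e^{j\omega}) = \left[ s(\tilde{n}) * \left( w(\tilde{n})\, e^{j\omega \tilde{n}} \right) \right] e^{-j\omega \tilde{n}} \,\Big|_{\tilde{n}=n} \qquad (3.11)$$
Equation (3.11) can be interpreted as follows. The signal $s(n)$ is passed through a band-pass filter with centre frequency $\omega$ and impulse response $w(n)e^{j\omega n}$. The result is then shifted down in frequency by modulating it with $e^{-j\omega n}$ to produce a low-frequency (baseband) signal.
Figure 3.3 illustrates a signal frame and its corresponding spectrum.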
The power spectral density over a short time interval, that is, the short-time spectrum of the speech signal, can be regarded as the product of two components. The first is the spectral envelope, which varies slowly with frequency; the second is the spectral fine structure, which varies very rapidly with frequency. For voiced sounds the spectral fine structure forms periodic patterns, whereas for unvoiced sounds it does not. The spectral envelope, or overall spectral feature, describes not only the resonance and anti-resonance characteristics of the articulatory organs but also the overall shape of the glottal source spectrum and the radiation characteristics at the lips and the nasal cavity. The spectral fine structure, in contrast, describes the periodicity of the sound source.
Equation (3.9) is a function of the continuous analysis frequency $\omega$. For the FT to become a useful tool in practical analysis, we need to compute it on a discrete set of frequencies, with a window function of finite width, and at every shift step $R > 1$. We then have:
$$S_{rR}(k) = \sum_{m=rR-L+1}^{rR} s(m)\, w(rR-m)\, e^{-j\frac{2\pi k}{N} m}, \qquad k = 0, 1, \ldots, N-1 \qquad (3.12)$$
Here $N$ is the number of equally spaced frequencies in the interval $0 \le \omega < 2\pi$, and $L$ is the length of the window function (measured in samples). Because we assume that the window $w(n)$ is causal and non-zero only in the interval $0 \le m \le L-1$, the windowed signal $s(m)w(rR-m)$ is non-zero only in the interval $rR-L+1 \le m \le rR$.
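Equation (3.12) can be evaluated directly, although a practical implementation would normally use an FFT. The function name `stft_frame`, the rectangular window and the test cosine are illustrative assumptions; the sum simply skips indices that fall before the start of the signal.

```python
import cmath
import math

def stft_frame(s, r, R, window, N):
    """S_rR(k), k = 0..N-1, computed directly from (3.12)."""
    L = len(window)
    n = r * R                                   # end time of the frame
    first = max(0, n - L + 1)
    last = min(n, len(s) - 1)
    spectrum = []
    for k in range(N):
        acc = 0j
        for m in range(first, last + 1):
            acc += s[m] * window[n - m] * cmath.exp(-2j * cmath.pi * k * m / N)
        spectrum.append(acc)
    return spectrum

if __name__ == "__main__":
    N = 32
    s = [math.cos(2 * math.pi * 4 * m / N) for m in range(N)]
    S = stft_frame(s, r=1, R=31, window=[1.0] * N, N=N)
    # Energy concentrates in bin 4 (and its mirror image, bin 28)
    print([round(abs(v), 6) for v in S[:6]])
```

Taking $20\log_{10}$ of the magnitudes obtained for successive values of $r$ gives one column of a spectrogram per frame.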

Figure 3.3 A signal frame and its corresponding spectrum

3.5.2 Spectrogram
The spectrogram is one of the basic tools of speech spectral analysis. It converts the two-dimensional speech waveform into a three-dimensional structure (amplitude/frequency/time). In a spectrogram, time and frequency are the horizontal and vertical axes respectively, and amplitude is represented by the darkness of the display. The peaks of the signal spectrum appear as dark horizontal bands. The centre frequencies of these bands are usually taken to be the formants. Voiced sounds produce vertical striations in the spectrogram, because the amplitude of the speech signal increases each time the vocal folds close. The noise in unvoiced sounds produces rectangular patterns with random endings and many shades of darkness, owing to the instantaneous changes in signal energy. The spectrogram shows only the spectral magnitude of the signal and ignores the phase, because phase information is considered unimportant in most speech applications.
To construct a spectrogram, the magnitude of the short-time Fourier transform (STFT), $|S_n(e^{j\omega})|$, is displayed with time on the horizontal axis and frequency (from 0 to $\pi$, that is, from 0 to $F_s/2$, where $F_s$ is the sampling frequency) on the vertical axis, while the magnitude is shown as darkness (usually on a logarithmic scale):
$$S(t_r, f_k) = 20 \log_{10} |S_{rR}(k)| \qquad (3.13)$$
where $t_r = rRT$, $f_k = k/(NT)$ and $T$ is the sampling period of the signal. Figure 3.4 shows the spectrogram of a speech signal together with the corresponding waveform.

Figure 3.4 Spectrograms of the speech signal "Should we chase"

The two spectrograms were computed with window functions of different lengths. The upper spectrogram is the result of using a window of 101 samples, corresponding to 10 ms. This analysis window length is approximately equal to the period of the waveform in the voiced regions. As a result, in the voiced regions the spectrogram shows vertically oriented striations, reflecting the fact that the sliding window sometimes covers mostly large-amplitude samples and sometimes mostly small-amplitude samples. In other words, when the analysis window is short, each individual pitch period is clearly resolved in time, while the frequency resolution is poor. For this reason, a spectrogram obtained with a short analysis window is called a wideband spectrogram. Conversely, a spectrogram obtained with a long analysis window is called a narrowband spectrogram. A narrowband spectrogram has high frequency resolution but poor time resolution. The lower part of Figure 3.4 is the result of using an analysis window of 401 samples, corresponding to 40 ms, which spans several periods of the signal. As we can see, the corresponding spectrogram is no longer sensitive to changes over time.
3.6. Linear predictive coding (LPC) analysis
Linear predictive analysis is one of the most powerful and most widely used speech analysis methods. Its importance lies in its ability to provide accurate estimates of the speech parameters and in its relatively fast computation.
The model of speech analysis based on linear predictive coding (LPC) is shown in Figure 3.5. LPC analysis performs spectral analysis on blocks of the signal, also called speech frames, using an all-pole model. This means that the resulting spectral representation $X_n(e^{j\omega})$ is constrained to the form $\sigma/A(e^{j\omega})$, where $A(e^{j\omega})$ is a polynomial of order $p$ whose z-transform is:
$$A(z) = 1 + a_1 z^{-1} + a_2 z^{-2} + \ldots + a_p z^{-p} \qquad (3.14)$$
Note that from (3.18) onwards the same polynomial is written with the opposite sign convention for the coefficients, $A(z) = 1 - \sum_{k=1}^{p} a_k z^{-k}$.

Figure 3.5 LPC analysis model for the speech signal

The order $p$ of the polynomial is also called the LPC analysis order. The output of the LPC spectral analysis block is a vector of coefficients (also called the LPC parameters) that specify the spectrum of the all-pole model that best matches the spectrum of the original signal over the time interval in which the signal samples are considered.
The idea behind the LPC model is that a speech sample at any time $n$, $s(n)$, can be approximated as a linear combination of the $p$ previous samples. In other words:
$$s(n) \approx a_1 s(n-1) + a_2 s(n-2) + \ldots + a_p s(n-p) \qquad (3.15)$$
The coefficients $a_1, a_2, \ldots, a_p$ are assumed to be constant over the analysis frame. Expression (3.15) becomes an equality if we add an excitation term $Gu(n)$:
$$s(n) = \sum_{i=1}^{p} a_i\, s(n-i) + G\,u(n) \qquad (3.16)$$
In (3.16), $u(n)$ is the normalized excitation and $G$ is the gain of the excitation. Expressing (3.16) in the z-domain gives:
$$S(z) = \sum_{i=1}^{p} a_i z^{-i} S(z) + G\,U(z) \qquad (3.17)$$
The corresponding transfer function is:
$$H(z) = \frac{S(z)}{G\,U(z)} = \frac{1}{1 - \sum_{i=1}^{p} a_i z^{-i}} = \frac{1}{A(z)} \qquad (3.18)$$
The transfer function (3.18) can be realized by the block diagram in Figure 3.6, which can be explained as follows. The normalized excitation source $u(n)$ is multiplied by the gain $G$ and becomes the input of an all-pole system $H(z) = 1/A(z)$ that produces the speech signal $s(n)$. We know that the actual excitation of speech is a quasi-periodic pulse train for voiced speech and a random noise source for unvoiced speech. From this fact, it is easy to build a speech synthesis circuit based on the LPC analysis model, as in Figure 3.7. In this synthesis scheme, a switch selects the excitation source that matches voiced or unvoiced speech. The gain $G$ is estimated from the speech signal. The digital filter $H(z)$ is controlled by the vocal tract parameters corresponding to the speech being produced. Specifically, the parameters of this synthesis model are the voiced/unvoiced classification, the pitch period of the signal, the gain and the filter coefficients $a_k$. All of these parameters vary slowly with time.

Figure 3.6 Linear prediction model of speech
Suppose that the linear combination of the samples preceding the time under consideration is an estimate of the signal, denoted $\tilde{s}(n)$:
$$\tilde{s}(n) = \sum_{k=1}^{p} a_k\, s(n-k) \qquad (3.19)$$
The prediction error $e(n)$ is then:
$$e(n) = s(n) - \tilde{s}(n) = s(n) - \sum_{k=1}^{p} a_k\, s(n-k) \qquad (3.20)$$
In other words, the corresponding error transfer function is:
$$A(z) = \frac{E(z)}{S(z)} = 1 - \sum_{k=1}^{p} a_k z^{-k} \qquad (3.21)$$
From this we see that if the speech signal is produced by the system in Figure 3.6, the prediction error $e(n)$ equals the excitation signal $Gu(n)$.
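This relation between (3.16) and (3.20) can be verified numerically: filtering an excitation through the all-pole model and then applying the prediction error filter with the same coefficients returns the scaled excitation. The coefficient values, the impulse-train excitation and the function names below are illustrative assumptions.

```python
def synthesize(a, excitation, G=1.0):
    """s(n) = sum_i a_i s(n-i) + G u(n), equation (3.16); a[0] is a_1."""
    s = []
    for n, u in enumerate(excitation):
        pred = sum(a[k] * s[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
        s.append(pred + G * u)
    return s

def prediction_error(a, s):
    """e(n) = s(n) - sum_k a_k s(n-k), equation (3.20)."""
    return [s[n] - sum(a[k] * s[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(s))]

if __name__ == "__main__":
    a = [1.2, -0.5]                                   # a stable 2nd-order model
    u = [1.0 if n % 50 == 0 else 0.0 for n in range(200)]   # impulse train
    s = synthesize(a, u, G=2.0)
    e = prediction_error(a, s)
    print(max(abs(e[n] - 2.0 * u[n]) for n in range(len(u))))   # close to zero
```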
The problem in LPC analysis is to determine the set of coefficients $a_k$ directly from the speech signal so that the spectral properties of the filter in Figure 3.7 match the spectrum of the speech signal within the analysis window. Because the spectral characteristics of speech change with time, the predictor coefficients at a given time $n$ must be estimated from a short segment of the speech signal around time $n$. The basic approach is therefore to find a set of predictor coefficients that minimizes the mean squared prediction error over a short segment of the signal being analyzed. Spectral analysis of this kind is usually carried out on successive frames spaced on the order of 10 ms apart.

Figure 3.7 LPC-based speech synthesis model

To set up the equations and then find suitable predictor coefficients, we define the short-time signal frames and the corresponding short-time errors:
$$s_n(m) = s(n+m) \qquad (3.22)$$
$$e_n(m) = e(n+m) \qquad (3.23)$$
We need to minimize the mean squared error signal at time $n$:
$$\varepsilon_n = \sum_{m} e_n^2(m) \qquad (3.24)$$
Using the definitions of $e_n(m)$ and $s_n(m)$, expression (3.24) can be rewritten as:
$$\varepsilon_n = \sum_{m} \left[ s_n(m) - \sum_{k=1}^{p} a_k\, s_n(m-k) \right]^2 \qquad (3.25)$$

To find the minimum of (3.25), we differentiate with respect to each coefficient $a_k$ in turn and set the result to zero:
$$\frac{\partial \varepsilon_n}{\partial a_k} = 0, \qquad k = 1, 2, \ldots, p \qquad (3.26)$$
This gives, for $i = 1, 2, \ldots, p$:
$$\sum_{m} s_n(m-i)\, s_n(m) = \sum_{k=1}^{p} \hat{a}_k \sum_{m} s_n(m-i)\, s_n(m-k) \qquad (3.27)$$
Terms of the form $\sum_m s_n(m-i)\, s_n(m-k)$ are the components of the short-time covariance of $s_n(m)$. In other words:
$$\phi_n(i,k) = \sum_{m} s_n(m-i)\, s_n(m-k) \qquad (3.28)$$
Expression (3.27) can then be written compactly as:
$$\phi_n(i,0) = \sum_{k=1}^{p} \hat{a}_k\, \phi_n(i,k) \qquad (3.29)$$
Expression (3.29) represents a system of $p$ equations in $p$ unknowns. It is easy to show that the minimum mean squared error $\hat{\varepsilon}_n$ is:
$$\hat{\varepsilon}_n = \sum_{m} s_n^2(m) - \sum_{k=1}^{p} \hat{a}_k \sum_{m} s_n(m)\, s_n(m-k) = \phi_n(0,0) - \sum_{k=1}^{p} \hat{a}_k\, \phi_n(0,k) \qquad (3.30)$$
We see that the minimum mean squared error consists of a fixed term $\phi_n(0,0)$ and terms that depend on the predictor coefficients.
To find the optimal predictor coefficients $\hat{a}_k$, we must first compute $\phi_n(i,k)$ (for $1 \le i \le p$ and $0 \le k \le p$) and then solve the $p$ simultaneous equations (3.29). In practice, both the solution of the system and the computation of the terms $\phi_n(i,k)$ depend strongly on the range of $m$ used to define the analysis frame and the region over which the mean squared error is computed. There are two standard ways of defining this range for speech signals: the autocorrelation method and the covariance method.
The autocorrelation method follows directly from limiting the range of $m$ in the summation by assuming that the speech segment $s_n(m)$ is zero outside the interval $0 \le m \le N-1$. This is equivalent to assuming that the speech signal $s(n+m)$ is multiplied by a finite-length window $w(m)$ that is zero outside the interval $0 \le m \le N-1$. In other words, the speech segment used for minimizing the mean squared error can be written as:
$$s_n(m) = \begin{cases} s(n+m)\, w(m), & 0 \le m \le N-1 \\ 0, & \text{otherwise} \end{cases} \qquad (3.31)$$
From (3.31), for $m < 0$ the error signal $e_n(m)$ is zero because $s_n(m) = 0$ there. Similarly, for $m > N-1+p$ there is no prediction error, because $s_n(m) = 0$ there as well. However, near $m = 0$ (that is, from $m = 0$ to $m = p-1$) the windowed signal is predicted from previous samples, some of which may be zero. A relatively large prediction error can therefore occur in this region. Near $m = N-1$ (that is, from $m = N-1$ to $m = N-1+p$) a large prediction error can also occur, because zero-valued samples produced by the windowing are predicted from the last few non-zero samples of the signal. For voiced speech, the potential for large prediction errors at the beginning or end of the frame is most pronounced when a pitch period begins at or very close to the points $m = 0$ or $m = N-1$. For unvoiced speech this effect is almost absent, because no part of the signal is position sensitive. These effects, together with the windowed signal, are illustrated in Figures 3.8-3.10.

Figure 3.8 Illustration of a large prediction error at the beginning of the frame for a voiced signal

Figure 3.9 Illustration of a large prediction error at the end of the frame for a voiced signal

Figure 3.10 Illustration of the large prediction error case for an unvoiced signal

The purpose of windowing is to taper the signal near the points $m = 0$ and $m = N-1$ so as to minimize the errors in these boundary regions.
With the windowed signal segment defined in this way, the mean squared error can be written as:
$$\varepsilon_n = \sum_{m=0}^{N-1+p} e_n^2(m) \qquad (3.32)$$
and $\phi_n(i,k)$ can be rewritten as:
$$\phi_n(i,k) = \sum_{m=0}^{N-1+p} s_n(m-i)\, s_n(m-k), \qquad 1 \le i \le p,\ 0 \le k \le p \qquad (3.33)$$
By a change of index, the expression above can be written as:
$$\phi_n(i,k) = \sum_{m=0}^{N-1-(i-k)} s_n(m)\, s_n(m+i-k), \qquad 1 \le i \le p,\ 0 \le k \le p \qquad (3.34)$$
We see that (3.34) depends only on the difference $i-k$, not on the two variables $i$ and $k$ independently. The covariance function $\phi_n(i,k)$ therefore reduces to the (one-argument) autocorrelation function:
$$\phi_n(i,k) = \phi_n(i-k) = \sum_{m=0}^{N-1-(i-k)} s_n(m)\, s_n(m+i-k), \qquad 1 \le i \le p,\ 0 \le k \le p \qquad (3.35)$$
Because the autocorrelation function is symmetric, that is, $\phi_n(-k) = \phi_n(k)$, the LPC equations can be written as:
$$\sum_{k=1}^{p} \phi_n(|i-k|)\, \hat{a}_k = \phi_n(i), \qquad 1 \le i \le p \qquad (3.36)$$
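In matrix form we have:
$$\begin{bmatrix}
\phi_n(0) & \phi_n(1) & \phi_n(2) & \cdots & \phi_n(p-1) \\
\phi_n(1) & \phi_n(0) & \phi_n(1) & \cdots & \phi_n(p-2) \\
\phi_n(2) & \phi_n(1) & \phi_n(0) & \cdots & \phi_n(p-3) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\phi_n(p-1) & \phi_n(p-2) & \phi_n(p-3) & \cdots & \phi_n(0)
\end{bmatrix}
\begin{bmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \vdots \\ \hat{a}_p \end{bmatrix}
=
\begin{bmatrix} \phi_n(1) \\ \phi_n(2) \\ \phi_n(3) \\ \vdots \\ \phi_n(p) \end{bmatrix} \qquad (3.37)$$
In this equation the matrix of autocorrelation values is a Toeplitz matrix (a symmetric matrix in which all elements along each diagonal are equal). The system can therefore be solved easily with known efficient algorithms, such as the Levinson-Durbin recursion.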
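One such efficient algorithm is the Levinson-Durbin recursion, which solves a symmetric Toeplitz system of this kind in order $p^2$ operations. The sketch below follows the sign convention of (3.19), $\tilde{s}(n) = \sum_k a_k s(n-k)$; the function names and the synthetic test frame are illustrative assumptions.

```python
def autocorrelation(frame, p):
    """phi_n(0..p) of a (windowed) frame, as in (3.35)."""
    N = len(frame)
    return [sum(frame[m] * frame[m + k] for m in range(N - k))
            for k in range(p + 1)]

def levinson_durbin(r, p):
    """Solve the Toeplitz normal equations for a_1..a_p.

    Returns (coefficients, final prediction error energy).
    """
    a = [0.0] * (p + 1)                 # a[1..p]; a[0] is unused
    err = r[0]
    for i in range(1, p + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                   # reflection coefficient of order i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

if __name__ == "__main__":
    # Frame = decaying impulse response of the all-pole model with a = [1.2, -0.5]
    frame = [1.0, 1.2]
    for _ in range(198):
        frame.append(1.2 * frame[-1] - 0.5 * frame[-2])
    coeffs, err = levinson_durbin(autocorrelation(frame, 2), 2)
    print(coeffs, err)
```

For this synthetic frame the recursion recovers the coefficients of the generating model, because the impulse response of an all-pole filter satisfies the normal equations of the autocorrelation method.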
The covariance method is an alternative to the autocorrelation method described above. It fixes the interval over which the mean squared error is computed to $0 \le m \le N-1$ and uses the signal in that interval directly, without windowing.
The mean squared error is then:
$$\varepsilon_n = \sum_{m=0}^{N-1} e_n^2(m) \qquad (3.38)$$
and the covariance is computed as:
$$\phi_n(i,k) = \sum_{m=0}^{N-1} s_n(m-i)\, s_n(m-k), \qquad 1 \le i \le p,\ 0 \le k \le p \qquad (3.39)$$
or, by a change of index:
$$\phi_n(i,k) = \sum_{m=-i}^{N-1-i} s_n(m)\, s_n(m+i-k), \qquad 1 \le i \le p,\ 0 \le k \le p \qquad (3.40)$$
Note that the computation in (3.40) involves the samples $s_n(m)$ from $m = -p$ to $m = N-1-p$ when $i = p$, and the samples $s_n(m+i-k)$ from time 0 to time $N-1$ when $k = 0$. The range of signal needed for the complete computation is therefore from $s_n(-p)$ to $s_n(N-1)$. In other words, the computation requires samples outside the error minimization interval, namely $s_n(-p), s_n(-p+1), \ldots, s_n(-1)$.
Using this extended range of signal to compute the covariance values $\phi_n(i,k)$, the LPC equations in matrix form are:
$$\begin{bmatrix}
\phi_n(1,1) & \phi_n(1,2) & \phi_n(1,3) & \cdots & \phi_n(1,p) \\
\phi_n(2,1) & \phi_n(2,2) & \phi_n(2,3) & \cdots & \phi_n(2,p) \\
\phi_n(3,1) & \phi_n(3,2) & \phi_n(3,3) & \cdots & \phi_n(3,p) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\phi_n(p,1) & \phi_n(p,2) & \phi_n(p,3) & \cdots & \phi_n(p,p)
\end{bmatrix}
\begin{bmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \vdots \\ \hat{a}_p \end{bmatrix}
=
\begin{bmatrix} \phi_n(1,0) \\ \phi_n(2,0) \\ \phi_n(3,0) \\ \vdots \\ \phi_n(p,0) \end{bmatrix} \qquad (3.41)$$
The matrix of covariance coefficients is symmetric (because $\phi_n(i,k) = \phi_n(k,i)$) but it is not a Toeplitz matrix. The system can be solved using the Cholesky decomposition. In practice, the full covariance form of LPC analysis is rarely used in speech recognition systems.
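The covariance method can be sketched with a small hand-written Cholesky solver. The function name `covariance_lpc`, its arguments (frame start `n`, length `N`, order `p`) and the synthetic test signal are illustrative assumptions; the frame start must satisfy `n >= p` so that the samples $s_n(-p), \ldots, s_n(-1)$ exist.

```python
import math

def covariance_lpc(s, n, N, p):
    """Covariance-method LPC on s(n .. n+N-1), using p samples before the frame."""
    def phi(i, k):                       # equation (3.39) with s_n(m) = s(n+m)
        return sum(s[n + m - i] * s[n + m - k] for m in range(N))

    A = [[phi(i, k) for k in range(1, p + 1)] for i in range(1, p + 1)]
    b = [phi(i, 0) for i in range(1, p + 1)]

    # Cholesky decomposition A = L L^T (A is symmetric positive definite)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = A[i][j] - sum(L[i][t] * L[j][t] for t in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]

    # Forward substitution L y = b, then back substitution L^T a = y
    y = []
    for i in range(p):
        y.append((b[i] - sum(L[i][t] * y[t] for t in range(i))) / L[i][i])
    a = [0.0] * p
    for i in reversed(range(p)):
        a[i] = (y[i] - sum(L[t][i] * a[t] for t in range(i + 1, p))) / L[i][i]
    return a

if __name__ == "__main__":
    s = [1.0, 1.2]
    for _ in range(58):
        s.append(1.2 * s[-1] - 0.5 * s[-2])
    print(covariance_lpc(s, n=5, N=20, p=2))
```

Because no window is applied, the method recovers the coefficients of this synthetic signal from only 20 samples, at the cost of solving a non-Toeplitz system.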
3.7. Cepstral analysis
The concept of the cepstrum was introduced by Bogert, Healy and Tukey. The cepstrum is defined as the inverse Fourier transform (IFT) of the logarithm of the spectral magnitude of the signal. In other words, the cepstrum of a discrete-time signal is given by:
$$c_n(m) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \left| S_n(e^{j\omega}) \right| e^{j\omega m}\, d\omega \qquad (3.42)$$
Here $\log|S_n(e^{j\omega})|$ is the logarithm of the magnitude of the FT of the signal. Definition (3.42) can be extended to the complex cepstrum as follows:
$$\hat{c}_n(m) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \log\{ S_n(e^{j\omega}) \}\, e^{j\omega m}\, d\omega \qquad (3.43)$$
In (3.43), $\log\{S_n(e^{j\omega})\}$ is the complex logarithm of $S_n(e^{j\omega})$, defined as:
$$\hat{S}_n(e^{j\omega}) = \log\{ S_n(e^{j\omega}) \} = \log \left| S_n(e^{j\omega}) \right| + j \arg\left[ S_n(e^{j\omega}) \right] \qquad (3.44)$$
Suppose $s(n) = s_1(n) * s_2(n)$. From the definition of the cepstrum it follows that $\hat{c}(n) = \hat{c}_1(n) + \hat{c}_2(n)$. The cepstrum thus turns convolution into addition. This is what makes cepstral analysis a useful tool for speech analysis.
However, (3.42)-(3.44) are mathematical definitions. For them to be useful in practical analysis, we need formulas that can be computed easily. Because the discrete Fourier transform (DFT) is a sampled version of the discrete-time Fourier transform (DTFT) of a finite-length sequence (that is, $S(k) = S(e^{j2\pi k/N})$), the DTFT and the inverse DTFT can be replaced by the DFT and the inverse DFT respectively:
$$S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-j 2\pi k n / N} \qquad (3.45)$$
$$\hat{X}(k) = \log |S(k)| + j \arg\left[ S(k) \right] \qquad (3.46)$$
$$\hat{s}(n) = \frac{1}{N} \sum_{k=0}^{N-1} \hat{X}(k)\, e^{j 2\pi k n / N} \qquad (3.47)$$
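The DFT-based computation can be sketched as follows. To avoid phase unwrapping, the sketch keeps only the $\log|S(k)|$ term of (3.46), so it computes the real cepstrum of (3.42) rather than the complex cepstrum; this simplification, the small constant added before the logarithm, the slow direct DFT and the test frame are all assumptions made for the illustration.

```python
import cmath
import math

def dft(x):
    """Direct DFT, equation (3.45); O(N^2), adequate for a short frame."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def real_cepstrum(frame):
    """Inverse DFT of log|S(k)| (real cepstrum)."""
    N = len(frame)
    log_mag = [math.log(abs(X) + 1e-12) for X in dft(frame)]
    return [sum(log_mag[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)).real / N
            for n in range(N)]

if __name__ == "__main__":
    # A unit pulse followed by a half-amplitude echo 8 samples later,
    # i.e. the convolution of a pulse with the pair {1, 0.5 at delay 8}.
    frame = [0.0] * 64
    frame[0], frame[8] = 1.0, 0.5
    c = real_cepstrum(frame)
    peak = max(range(1, 33), key=lambda n: c[n])
    print(peak, c[peak])
```

The echo shows up as a cepstral peak at its delay, which is the same mechanism that produces a peak at the pitch period for voiced speech.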
3.8. Methods for determining formant frequencies
The formants of the speech signal are important and useful parameters with wide application in many areas, such as speech processing, synthesis and recognition. The formants are the resonance frequencies of the vocal tract. They usually appear in spectral representations, such as the spectrogram, as regions of high energy, and they change slowly with time as the articulators move. Formants are important and useful in speech processing research because they describe the most important aspects of speech with a very small set of features. For example, in speech coding, if formant parameters are used to represent the configuration of the vocal tract and a few auxiliary parameters are used to represent the excitation source, coding rates as low as 2.4 kbps can be achieved.
Many studies in speech processing and recognition have shown that formant parameters are among the best candidates for an efficient representation of the vocal tract spectrum. However, determining the formants is not simply a matter of locating the peaks of the magnitude spectrum, because the spectral peaks of the vocal tract output depend in a complicated way on many factors, such as the vocal tract configuration and the excitation sources.
Formant estimation methods search for peaks in spectral representations, usually obtained from STFT analysis or from linear predictive coding (LPC).
a) Determining formants from STFT analysis
Analog and discrete STFT analysis has become a basic tool for many developments in speech analysis and synthesis.
The STFT contains the formant information directly in its spectral magnitude. It is therefore a basis for analyzing the formant frequencies of the speech signal.
b) Determining formants from LPC analysis
The formant frequencies can be estimated from the predictor parameters in one of two ways. The first is to factor the predictor polynomial directly and decide, from the roots obtained, which roots correspond to formants. The second is to compute the spectrum and select the formants as the sharp peaks, using one of the known peak-picking algorithms.
An advantage of using LPC analysis for formant estimation is that the centre frequencies of the formants and their bandwidths can be determined accurately by factoring the predictor polynomial. For a chosen LPC analysis order $p$, the maximum possible number of complex-conjugate pole pairs is $p/2$. The labelling step, which decides which poles correspond to which formants, is therefore simpler than in other methods. In addition, extraneous poles can usually be identified easily in LPC analysis, because their bandwidths are usually much larger than the typical bandwidths of speech formants.
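The second approach (peak picking on the LPC spectrum) can be sketched as follows. The function evaluates $1/|A(e^{j\omega})|$ with $A(z) = 1 - \sum_k a_k z^{-k}$ on a frequency grid and returns its local maxima. The function name, the grid size, the simple three-point peak test and the single-resonance test model (pole radius 0.95 at 500 Hz, sampling rate 8 kHz) are illustrative assumptions.

```python
import cmath
import math

def lpc_spectrum_peaks(a, fs, n_points=512):
    """Frequencies (Hz) of the local maxima of 1/|A(e^{jw})| on 0..fs/2."""
    mags = []
    for i in range(n_points):
        w = math.pi * i / n_points
        A = 1 - sum(a[k] * cmath.exp(-1j * w * (k + 1)) for k in range(len(a)))
        mags.append(1.0 / abs(A))
    return [i * fs / (2.0 * n_points)
            for i in range(1, n_points - 1)
            if mags[i - 1] < mags[i] > mags[i + 1]]

if __name__ == "__main__":
    fs, f_res, radius = 8000, 500.0, 0.95
    theta = 2 * math.pi * f_res / fs
    # One complex-conjugate pole pair: A(z) = 1 - 2 r cos(theta) z^-1 + r^2 z^-2
    a = [2 * radius * math.cos(theta), -radius ** 2]
    print(lpc_spectrum_peaks(a, fs))
```

The peak lies slightly below the pole angle because the pole is not exactly on the unit circle; factoring the polynomial would give the pole frequency and bandwidth directly.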
3.9. Methods for determining the fundamental frequency
The fundamental frequency $F_0$ is the vibration frequency of the vocal folds. It depends on gender and age: $F_0$ is usually higher for women than for men, and higher for young people than for older people. For male voices $F_0$ typically lies in the range 80-250 Hz; for female voices it lies in the range 150-500 Hz. The variation of $F_0$ determines the tone of a word as well as the intonation of a sentence. The question is how to determine the fundamental frequency. Methods for determining the fundamental frequency include: the autocorrelation method; the average magnitude difference function method; the method using inverse filtering and the autocorrelation function; and homomorphic processing.
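a) Using the autocorrelation function
The autocorrelation function $\phi_n(k)$ reaches its maxima at lags that are multiples of the fundamental period of the signal. The fundamental frequency is then the rate at which the peaks of $\phi_n(k)$ occur. The problem becomes that of determining the period of the autocorrelation function.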
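A minimal sketch of this method searches for the lag with the largest autocorrelation value inside a plausible pitch range. The function name, the default search range of 80-500 Hz (taken from the typical $F_0$ ranges quoted above) and the synthetic two-harmonic test frame are illustrative assumptions; real speech usually needs additional steps such as a voicing decision.

```python
import math

def pitch_autocorrelation(frame, fs, f_min=80.0, f_max=500.0):
    """Estimate F0 (Hz) from the lag of the largest autocorrelation peak."""
    k_min = int(fs / f_max)
    k_max = int(fs / f_min)

    def phi(k):
        return sum(frame[m] * frame[m + k] for m in range(len(frame) - k))

    best_lag = max(range(k_min, k_max + 1), key=phi)
    return fs / best_lag

if __name__ == "__main__":
    fs = 8000
    frame = [math.sin(2 * math.pi * 200 * n / fs)
             + 0.5 * math.sin(2 * math.pi * 400 * n / fs) for n in range(400)]
    print(pitch_autocorrelation(frame, fs))
```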
b) Using the average magnitude difference function (AMDF)
As mentioned, if the sequence $s(n)$ is periodic with period $T$, the AMDF $\Delta M_n$ vanishes at lags that are multiples of $T$. We therefore only need to find two adjacent minima; their spacing gives the period of the sequence, from which the fundamental frequency follows.
c) Using the zero-crossing rate
For discrete-time signals, a zero crossing occurs when adjacent samples have different signs. The zero-crossing rate is therefore a simple measure of the frequency of the signal. For example, a sinusoid of frequency $F_0$ sampled at frequency $F_s$ has $F_s/F_0$ samples per period. Because each period contains two zero crossings, the average zero-crossing rate is $Z_n = 2F_0/F_s$ crossings per sample. The average zero-crossing rate thus gives a rough estimate of the frequency of the sinusoid.
d) Using the STFT
From the Fourier representation of the speech signal, it is easy to see that for voiced speech the excitation produces sharp spectral peaks, and that these peaks occur at integer multiples of the fundamental frequency. This is the basic principle behind one of the methods for determining the fundamental frequency.

Figure 3.11 Frequency compression

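Consider the harmonic product spectrum, defined as:

$$P_n(e^{j\omega}) = \prod_{r=1}^{K} \left| S_n(e^{j\omega r}) \right|^2 \qquad (3.48)$$

Taking the logarithm of (3.48) gives the harmonic product spectrum on a logarithmic scale:

$$\hat{P}_n(e^{j\omega}) = 2 \sum_{r=1}^{K} \log \left| S_n(e^{j\omega r}) \right| \qquad (3.49)$$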
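The function $\hat{P}_n(e^{j\omega})$ in (3.49) is a sum of K frequency-compressed copies of the log magnitude spectrum $\log|S_n(e^{j\omega})|$. Its use is motivated by the observation that, for voiced speech, compressing the frequency axis by integer factors moves the harmonics of the fundamental onto the fundamental itself. At frequencies between the harmonics, some compressed harmonics from other copies also coincide, but only at the fundamental frequency do all the copies reinforce one another. Figure 3.11 illustrates this observation.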
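A small numerical sketch of the log harmonic product spectrum in (3.49) is given below. It is an illustration only: the function name `f0_hps`, the choice K = 4, the Hann window, the FFT length, and the 50-500 Hz search range are assumptions, and the test frame is a synthetic sum of five harmonics of 150 Hz.

```python
import numpy as np

def f0_hps(frame, fs, K=4, nfft=8192, f_min=50.0, f_max=500.0):
    """Peak of the log harmonic product spectrum, sum_r 2*log|S(r*omega)|."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), nfft)) + 1e-12
    log_spec = 2.0 * np.log(spec)
    n_bins = len(spec) // K
    hps = np.zeros(n_bins)
    for r in range(1, K + 1):
        # Keeping every r-th bin compresses the frequency axis by the factor r.
        hps += log_spec[::r][:n_bins]
    k_min = int(f_min * nfft / fs)
    k_max = int(f_max * nfft / fs)
    k = k_min + np.argmax(hps[k_min:k_max + 1])
    return k * fs / nfft

fs = 8000
n = np.arange(400)                       # 50 ms frame
frame = sum((1.0 / h) * np.sin(2 * np.pi * 150 * h * n / fs) for h in range(1, 6))
f0 = f0_hps(frame, fs)
```

Zero-padding the FFT gives a fine frequency grid, so that bin k of each compressed copy still falls inside the main lobe of the corresponding harmonic.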
e) Using cepstral analysis
In cepstral analysis one observes that, for voiced speech, the cepstrum has a sharp peak at the fundamental period of the signal, whereas for unvoiced speech no such peak appears. Cepstral analysis can therefore serve as a basic tool both for deciding whether a speech segment is voiced or unvoiced and for determining the fundamental period of voiced segments. Estimating the fundamental frequency by cepstral analysis is fairly simple. First the cepstrum is computed and searched for a peak in the neighborhood of the expected period. If the cepstral peak exceeds a preset threshold, the input segment is most likely voiced, and the position of the peak is an estimate of the fundamental period (and hence of the fundamental frequency).
Figure 3.12 illustrates the use of cepstral analysis to classify segments as unvoiced or voiced and to determine the fundamental frequency of the voiced ones. On the left is a sequence of short-time log spectra (the rapidly varying curves); on the right are the corresponding cepstra computed from the log spectra on the left. The log spectra and cepstra come from consecutive 50 ms segments obtained with a window that advances 12.5 ms at each step (that is, a shift of 100 samples at a sampling rate of 8000 samples/s). From the figure we see that in frames 1-5 the window covers only unvoiced speech (no cepstral peak appears, and the spectrum varies rapidly and randomly with no periodic structure), while frames 6 and 7 contain both unvoiced and voiced speech. Frames 8-15 contain only voiced speech. In the voiced frames the cepstral peak is clearly visible at a quefrency of about 11-12 ms, which corresponds to a fundamental frequency of roughly 83-91 Hz. The position of the peak is thus a good estimate of the fundamental period, and hence of the fundamental frequency, over the voiced interval.
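The procedure described above (compute the cepstrum, search near the expected period, compare the peak with a threshold) might be sketched as follows. The function name `cepstral_pitch`, the Hamming window, the 80-500 Hz search range, and the threshold value 0.1 are assumptions chosen for this example, not values given in the text; the test frame is a synthetic impulse train passed through a single assumed resonance.

```python
import numpy as np

def cepstral_pitch(frame, fs, f0_min=80.0, f0_max=500.0, threshold=0.1):
    """Return (voiced, f0); f0 is None when no cepstral peak exceeds the threshold."""
    windowed = frame * np.hamming(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + 1e-12)
    cepstrum = np.real(np.fft.ifft(log_mag))
    q_min = int(fs / f0_max)             # shortest period, in samples
    q_max = int(fs / f0_min)             # longest period, in samples
    q = q_min + np.argmax(cepstrum[q_min:q_max + 1])
    voiced = bool(cepstrum[q] > threshold)
    return voiced, (fs / q if voiced else None)

# Synthetic voiced frame: 100 Hz impulse train through one resonance near 700 Hz.
fs = 8000
excitation = np.zeros(400)               # 50 ms frame
excitation[::80] = 1.0                   # pitch period of 80 samples
r, theta = 0.95, 2 * np.pi * 700 / fs
frame = np.zeros(400)
for n in range(400):
    frame[n] = excitation[n]
    if n >= 1:
        frame[n] += 2 * r * np.cos(theta) * frame[n - 1]
    if n >= 2:
        frame[n] -= r * r * frame[n - 2]
voiced, f0 = cepstral_pitch(frame, fs)
```

In practice the threshold has to be tuned on real data, since the height of the cepstral peak depends on the window length and on how many pitch periods the frame contains.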

Figure 3.12 Logarithm of the harmonic components in the signal spectrum

3.10. Lab exercise: speech analysis
Use a personal computer and Matlab (or another programming language) to carry out the following tasks:
- Using the same content, each member of the group in turn pronounces (reads or speaks) the text and records it. Save the recordings as uncompressed files (*.wav).
- Using Matlab (or another programming language) and the material covered in this chapter:
  - Determine the fundamental frequency.
  - Determine the frequency of the first formant for each group member.
  - Draw the distribution map of the vowels of Vietnamese.


Chapter 4: Speech synthesis
4.1. Introduction
In the past, the term "speech synthesis" usually referred to the artificial generation of speech sounds by a machine built on the principle of simulating the human vocal organs. Today, with advances in science and technology, the term has been broadened to cover any process that delivers information in spoken form from a machine, where the messages are assembled flexibly to suit a particular need. Speech synthesis systems now have a very wide range of applications: spoken information services, reading machines for the blind, assistive devices for people with communication difficulties, and so on.
4.2. Speech synthesis methods
4.2.1 Direct synthesis
A simple way to synthesize messages is direct synthesis, in which the parts of a message are assembled from stored fragments of human speech. The speech units are usually words or phrases; they are stored, and the desired spoken message is synthesized by selecting and concatenating the appropriate units. There are many direct-synthesis techniques, classified by the size of the units being concatenated and by the type of signal representation used for concatenation. The common ones are: word concatenation, concatenation of sub-word units (phoneme-sized units), and concatenation of waveform segments.
a) Simple direct synthesis
The simplest way to create spoken messages is to record and store human speech as separate words and then play the words back in the desired order. This method has been used in the British telephone system since 1936, was widely used in public announcement systems from the 1960s onward, and is still found in many telephone management systems around the world. The system must hold in memory every component of every message that may have to be reproduced. The synthesizer merely joins the required units together in a given order, without changing or transforming the individual components.
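As a minimal illustration of the playback-in-order idea, the sketch below joins stored waveforms without modifying them. Everything in it is assumed for the example: the unit names, the dictionary `units`, and the sine bursts that stand in for real recorded words.

```python
import numpy as np

fs = 8000

def tone(freq, duration=0.2):
    """Stand-in for a recorded word: a short sine burst."""
    t = np.arange(int(fs * duration)) / fs
    return np.sin(2 * np.pi * freq * t)

# Hypothetical unit store; a real system would load one recording per word.
units = {"stop": tone(300), "number": tone(400), "five": tone(500)}

def synthesize(message, units):
    """Concatenate the stored units, unmodified, in message order."""
    return np.concatenate([units[word] for word in message])

waveform = synthesize(["stop", "number", "five"], units)
```

Because the units are joined as they are, any mismatch in pitch or amplitude at the boundaries is audible, which is exactly the limitation discussed next.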
The quality of a message synthesized in this way depends on how continuous the acoustic features (spectral envelope, amplitude, fundamental frequency, speaking rate) are across the concatenated units. The method works well when the messages take the form of a list, such as a sequence of digits, or when message blocks always appear at a fixed position in the sentence, because it is then easy to ensure that the output sounds natural in timing and pitch. When a particular sentence structure is required in which words are substituted at certain positions, those words must be recorded in exactly the position they will occupy in the sentence; otherwise they will not fit the intonation of the sentence. Digits, for example, must be recorded in two forms: one for sentence-final position and one for other positions. This is because the pitch contour of each speech unit changes with the position of the word in the sentence. Preparing the recordings is therefore very time-consuming and laborious. In addition, direct concatenation of speech units has great difficulty reproducing the natural influence of neighboring words on one another, as well as the intonation and rhythm of the sentence. A further limitation is that applications with a large number of messages require a very large memory.
The large storage requirement can be partly addressed by applying low-bit-rate coding to the speech units before they are stored. However, whether large units (words, phrases) are stored directly or in coded form, the number of messages that can be synthesized remains very limited. To increase it, words can be broken down into smaller sub-word units (diphones, demisyllables, syllables, and so on), which are then recorded and stored. The smaller the speech units, however, the lower the quality of the synthesized message.
Figure 4.1 compares the spectrogram of a sentence produced by simple direct synthesis with that of the original utterance.

Figure 4.1 Comparison of a directly synthesized message with the original message

b) Direct synthesis from waveform segments
As noted above, simple direct synthesis is limited in its ability to reproduce the tempo and naturalness (stress, rhythm, intonation) of the synthesized message. This problem can be addressed by synthesizing from waveform segments, a method also known as pitch-synchronous overlap-add. Consider the problem of joining two waveform segments of a vowel. The discontinuity in the synthesized waveform is minimized if the join occurs at the same position within a glottal cycle in both segments. This position is usually the region of lowest signal amplitude, where the vocal-tract response to the current glottal pulse has largely decayed, just before the next pulse. In other words, the two segments are joined in a pitch-synchronous manner. The most common method for doing this is TD-PSOLA (Time Domain Pitch Synchronous Overlap Add).
TD-PSOLA marks the positions in the speech waveform that correspond to closures of the vocal cords (that is, the pitch pulses). These pitch marks are used to cut a windowed waveform segment for each period. For each period, the window must be centered on the region of maximum signal amplitude, and the window shape must be chosen appropriately. In addition, the window must be longer than one period so that adjacent windowed segments overlap slightly.
Figure 4.2 illustrates the operating principle of TD-PSOLA using a Hanning window.

Figure 4.2 Principle of the TD-PSOLA method

The illustration shows that by overlap-adding the sequence of windowed waveform segments at the relative positions given by the analysis pitch marks, we can reconstruct the desired message quite accurately. Moreover, by changing the relative positions and the number of pitch marks, we can change the pitch and the duration of the synthesized message.
4.2.2 Formant synthesis
Formant synthesis was the first genuine synthesis technique to be developed and remained the dominant one until the early 1980s. It is also called synthesis by rule. It takes a modular, model-based, acoustic-phonetic approach to the synthesis problem. In this method the acoustic tube model is used in a specific way, so that the control elements of the tube can easily be related to acoustic-phonetic properties and can easily be observed.
Figure 4.3 shows the general block diagram of a formant synthesis system. Its operating principle is as follows. Sound is generated by a source. For vowels and voiced consonants, the source can be generated either entirely in the time domain as a periodic function, or by passing an impulse train through a linear filter that models the glottis (a glottal LTI filter). For unvoiced sounds, the source is a random noise generator. For stop consonants, the source is formed by combining the voiced source with the unvoiced source. The source signal is fed into the vocal tract model. To reproduce all the formants, separate models of the oral cavity and the nasal cavity are built in parallel. The signal therefore passes through the oral cavity model and, when nasal sounds are required, through the nasal cavity model as well. Finally, the outputs of the oral and nasal models are combined and passed through a radiation component, which models the propagation and load characteristics of the lips and nose.

Figure 4.3 Block diagram of formant synthesis

According to filter theory, a formant can be produced by a second-order IIR filter with transfer function:

$$H(z) = \frac{1}{1 - a_1 z^{-1} - a_2 z^{-2}} \qquad (4.1)$$

This transfer function can be factored as:

$$H(z) = \frac{1}{(1 - p_1 z^{-1})(1 - p_2 z^{-1})} \qquad (4.2)$$
We know that for the filter coefficients $a_1$ and $a_2$ to be real, the poles must form a complex-conjugate pair. Note that a second-order filter of this kind produces a spectrum with two formants, but only one of them lies at positive frequency. We can therefore regard the filter as producing a single useful formant. The poles can be plotted in the z-plane, where the magnitude of the poles determines the bandwidth and amplitude of the resonance: the smaller the magnitude, the flatter the resonance; the larger the magnitude (the closer the pole lies to the unit circle), the sharper the resonance.
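If we express the poles in polar coordinates with angle $\theta$ and radius $r$, and use the fact that they form a complex-conjugate pair, $p_{1,2} = r e^{\pm j\theta}$, the transfer function (4.1) can be written as:

$$H(z) = \frac{1}{1 - 2r\cos(\theta)\, z^{-1} + r^2 z^{-2}} \qquad (4.3)$$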
From this we see that a formant can be placed at any desired frequency simply by choosing the appropriate value of $\theta$. Controlling the bandwidth directly is harder. The position of the formant changes the shape of the spectrum, so an exact relationship valid in every case cannot be obtained. Note also that if two poles lie close together, they tend to merge into a single resonance peak, which again complicates the calculation of bandwidth. Empirically, the relationship between the normalized bandwidth $\hat{B}$ of a formant and the pole radius $r$ is reasonably approximated by:

$$\hat{B} = -\frac{\ln r}{\pi} \qquad (4.4)$$

(equivalently, $-2\ln r$ when the bandwidth is expressed in radians per sample).
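The transfer function can then be expressed in terms of the normalized frequency $\hat{F}$ and normalized bandwidth $\hat{B}$ of the formant:

$$H(z) = \frac{1}{1 - 2e^{-\pi\hat{B}}\cos(2\pi\hat{F})\, z^{-1} + e^{-2\pi\hat{B}} z^{-2}} \qquad (4.5)$$

Here the normalized frequency $\hat{F}$ and normalized bandwidth $\hat{B}$ are obtained by dividing $F$ and $B$, respectively, by the sampling frequency $F_s$.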
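As an illustration of (4.1) and (4.5), the sketch below computes the two coefficients of a single resonator from a formant frequency and bandwidth and runs the corresponding difference equation. The helper names `resonator_coeffs` and `resonate`, and the example values (500 Hz formant, 60 Hz bandwidth, 10 kHz sampling rate), are assumptions for the example.

```python
import numpy as np

def resonator_coeffs(F, B, fs):
    """Coefficients (a1, a2) of H(z) = 1 / (1 - a1 z^-1 - a2 z^-2), cf. (4.5)."""
    r = np.exp(-np.pi * B / fs)          # pole radius from normalized bandwidth
    theta = 2 * np.pi * F / fs           # pole angle from normalized frequency
    return 2 * r * np.cos(theta), -r * r

def resonate(x, a1, a2):
    """Run y(n) = x(n) + a1*y(n-1) + a2*y(n-2)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        y[n] = x[n]
        if n >= 1:
            y[n] += a1 * y[n - 1]
        if n >= 2:
            y[n] += a2 * y[n - 2]
    return y

fs = 10000
a1, a2 = resonator_coeffs(500.0, 60.0, fs)
impulse = np.zeros(4096)
impulse[0] = 1.0
h = resonate(impulse, a1, a2)            # impulse response of the resonator
spectrum = np.abs(np.fft.rfft(h))
peak_hz = np.argmax(spectrum) * fs / len(h)
```

The spectral peak of the impulse response falls very close to the requested formant frequency; the small offset from the pole angle grows as the bandwidth increases.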
To produce several formants, we can use a filter whose transfer function is the product of several second-order transfer functions. In other words, the transfer function of the vocal tract has the form:

$$H(z) = H_1(z)\, H_2(z)\, H_3(z)\, H_4(z) \qquad (4.6)$$

where $H_i(z)$ is a function of the frequency $F_i$ and bandwidth $B_i$ of the i-th formant.
The corresponding input-output relation in the time domain is:

$$y(n) = x(n) + a_1 y(n-1) + a_2 y(n-2) + \dots + a_8 y(n-8) \qquad (4.7)$$

A model of the nasal cavity can be built in the same way. Equations (4.6) and (4.7) describe formant synthesis in the series configuration, also called the cascade configuration.
Another technique is parallel formant synthesis, which models each formant separately. In other words, each model has its own transfer function $H_i(z)$. During speech generation the source signal is fed into each model separately, and the outputs $y_i(n)$ of the models are then summed:

$$y(n) = y_1(n) + y_2(n) + \dots \qquad (4.8)$$

Figure 4.4 shows the general configurations of cascade and parallel synthesis.
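The link between the product form (4.6) and the eighth-order difference equation (4.7) can be checked numerically: multiplying the four second-order denominators gives the coefficients $a_1, \dots, a_8$. The sketch below does this with polynomial multiplication; the four formant frequencies and bandwidths are illustrative values only, and the helper names are assumptions.

```python
import numpy as np

def resonator_denominator(F, B, fs):
    """Denominator [1, -a1, -a2] of one second-order section, cf. (4.5)."""
    r = np.exp(-np.pi * B / fs)
    return np.array([1.0, -2 * r * np.cos(2 * np.pi * F / fs), r * r])

def all_pole(x, a):
    """Run y(n) = x(n) + sum_k a[k-1] * y(n-k)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc += a[k - 1] * y[n - k]
        y[n] = acc
    return y

fs = 10000
formants = [(500, 60), (1500, 90), (2500, 120), (3500, 150)]   # illustrative (F, B) in Hz
sections = [-resonator_denominator(F, B, fs)[1:] for F, B in formants]

# Cascade form: multiply the four denominators to obtain a_1..a_8 of (4.7).
den = np.array([1.0])
for F, B in formants:
    den = np.convolve(den, resonator_denominator(F, B, fs))
a = -den[1:]

x = np.zeros(200)
x[0] = 1.0
y_direct = all_pole(x, a)                # single eighth-order equation (4.7)
y_cascade = x
for sec in sections:                     # four second-order sections in series (4.6)
    y_cascade = all_pole(y_cascade, sec)
y_parallel = sum(all_pole(x, sec) for sec in sections)   # parallel sum (4.8)
```

The cascade and the single eighth-order filter give the same output, whereas the parallel sum is a different filter: each branch there would normally have its own amplitude control.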

Figure 4.4 Configurations for multi-formant synthesis

The advantage of the cascade configuration is that, for a given set of formant values, the transfer functions and the input-output relation (the difference equation) are easy to construct. Synthesizing each formant separately, as in the parallel configuration, allows the frequency of each formant to be set precisely.
Although formant synthesis is simple and usually produces intelligible sound, it has difficulty achieving natural-sounding speech. This is because the source model and the transfer model are oversimplified and ignore many secondary factors that contribute to the dynamic character of the signal.
The Klatt synthesizer
The Klatt synthesizer is one of the most sophisticated formant-based speech synthesizers ever developed. Its block diagram is shown in Figure 4.5; it uses both parallel and cascade resonator systems.
In the diagram, the blocks $R_i$ are the resonators that generate the i-th formant, and the boxes $A_i$ control the corresponding signal amplitudes. The resonators are set up to operate at a sampling rate of 10 kHz, and 6 main formants are used.
Note that in practice formant synthesizers usually use a sampling rate of about 8 kHz or 10 kHz. This is not for any particular reason of principle related to synthesis quality, but because limits on storage space, processing speed, and output requirements did not allow higher sampling rates. Note also that studies have shown the first three formants to be sufficient to distinguish speech sounds; when 6 formants are used, the higher formants simply add naturalness to the synthesized signal.

Figure 4.5 Block diagram of the Klatt synthesizer


4.2.3 Articulatory synthesis
An obvious way to synthesize speech is to simulate the human vocal apparatus in some way. This was the principle of the classical "speaking machines", the most famous of which was built by Von Kempelen. Classical synthesizers of this kind were usually mechanical devices with tubes, bellows, and so on, operating rather like musical instruments; with a little training they could be used to produce recognizable speech. The machine was controlled by a human operator in real time, which gave the system many advantages, since the operator could use mechanisms such as feedback to control the machine and imitate the natural speech production process. Today, however, with the demand for more sophisticated synthesizers, these classical machines are clearly obsolete and cannot meet requirements.
As our understanding of the vocal apparatus has improved, articulatory synthesizers have become increasingly sophisticated and complete. Complex tube shapes are approximated by a series of smaller simple tubes. Because the sound transmission properties of simple tubes are known, they can be used to build complex general models of the vocal apparatus.
One advantage of articulatory synthesis is that it offers a more natural way of generating speech. However, the method also faces several difficulties. The first is deciding how to obtain the control parameters from the specification of the signal to be synthesized. This difficulty arises in other synthesis methods too, but in most of them the parameters, formant parameters for example, can be found directly from real speech: we simply record the speech, then compute and determine them. In articulatory synthesis this is harder, because the correct parameters of the vocal apparatus cannot be determined from a recording of the signal and must instead be obtained from measurements such as X-ray or MRI. The second difficulty is striking a balance between building a model that matches the human biological vocal apparatus as accurately as possible and building a practical model that is easy to design and implement. Both difficulties remain open challenges for researchers, and they are the reason why, to date, very few articulatory synthesis systems match the quality of synthesizers based on other principles.
4.3. Text-to-speech synthesis systems
Converting text to speech (TTS) is an ambitious goal and continues to be a focus of research and development. TTS is found in many everyday applications, such as reading email over the telephone and database applications for services that assist the blind. The block diagram of a typical TTS system and its components is shown in Figure 4.6.

Figure 4.6 Block diagram of a TTS system

The illustration shows that a TTS system can be characterized as a two-stage analysis-synthesis process. The first stage analyzes the text to determine the linguistic structure underlying it. Input text typically contains abbreviations, Roman numerals, dates, formulas, punctuation marks, and so on. The text analysis stage must be able to convert the input text into an acceptable standard form for use by the next stage. The abstract linguistic description obtained at this stage may include a phoneme sequence and other information, such as stress structure and syntactic structure. This description is converted into a phonetic transcription with the help of a pronunciation dictionary and accompanying pronunciation rules. The second stage performs the synthesis, constructing the signal waveform from the parameters obtained in the first stage.
Both the analysis and the synthesis stages of a TTS system involve a series of processing operations. Most modern TTS systems carry out these operations in a modular architecture such as the one shown in Figure 4.7.
The operation of the block diagram can be outlined as follows. When the text data stream is fed in, each module takes the input information, or information from other modules related to the text, and produces the output information required by the following modules. This continues until the final synthesized signal is produced. Processing and the transfer of information from one module to another are handled by a separate processing engine. The engine controls the sequence of operations to be executed and stores all information in suitable data structures.

Figure 4.7 Block diagram of the modular architecture of a modern TTS system

a) Text analysis
Text consists of letters and digits, spaces, and possibly a variety of other special characters. The first step in text analysis is therefore to preprocess the input text (including replacing digits and abbreviations with their full written forms) to turn it into a sequence of words. Preprocessing usually also has to detect and mark sentence breaks and other relevant formatting information, such as paragraph breaks. The subsequent text-processing modules convert the word sequence into a linguistic description. One of their most important functions is to determine the pronunciation of individual words. In languages such as English, the relationship between the spelling of a word and its phonemic transcription is extremely complex. Moreover, this relationship can differ between words with the same structure, as shown by the pronunciation of "ough" in the words "through", "though", "bough", "rough" and "cough".
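The preprocessing step described above can be pictured with a toy normalizer. The sketch below is purely illustrative: the abbreviation table, the digit-by-digit reading of numbers, and the function name `normalize` are assumptions, and real systems handle dates, Roman numerals, and sentence boundaries with far richer rules.

```python
import re

# Hypothetical lookup tables; a real system would use much larger ones.
ABBREVIATIONS = {"Dr.": "doctor", "No.": "number", "St.": "street"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

def normalize(text):
    """Expand abbreviations and digits, then return a list of lower-case words."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    # Spell out every digit separately (a simplification of number reading).
    text = re.sub(r"\d", lambda m: " " + DIGITS[int(m.group())] + " ", text)
    return re.findall(r"[a-z']+", text.lower())

words = normalize("Dr. Lee lives at No. 42.")
```

The output word list is what the later modules would look up in the pronunciation dictionary.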
As outlined above, the pronunciation of a word is usually determined by combining a pronunciation dictionary with accompanying pronunciation rules. In earlier TTS systems the emphasis was on deriving pronunciation by rule, with a small exceptions dictionary for common words with irregular pronunciations (such as "one", "two", "said"). Today, with cheap computer memory readily available, pronunciation is usually determined using a very large pronunciation dictionary (possibly several tens of thousands of words) to ensure that known words are pronounced correctly. Pronunciation rules are nevertheless still needed to handle unknown words, because new vocabulary is constantly being added to the language and because it is impossible to include every proper noun in the dictionary.
Determining the pronunciation of a word is easier if its structure, known as its morphology, is available. Most TTS systems include morphological analysis. This analysis determines the root form of each word (for example, the root form of "gives" is "give") and avoids the need to add every derived form to the dictionary. Some syntactic analysis of the text may also be needed to determine the correct pronunciation of certain words. In English, for example, the word "live" is pronounced differently depending on whether it is a verb or an adjective. The pronunciations determined so far are those of words spoken in isolation. Some adjustments are therefore needed to account for the phonetic effects that occur across word boundaries, in order to improve the naturalness of the synthesized speech.
Besides determining the pronunciation of the word sequence, the text analysis stage must also determine information about how the text is to be spoken. This information includes phrasing, word-level stress, and the intonation patterns of the various words. It is used to generate the prosody of the synthesized speech. Stress markers can be added to each word in the dictionary, but rules are also needed to assign stress to any word not found in the dictionary. Some words, such as "permit", are stressed on different syllables depending on whether they are used as a noun or a verb, so grammatical information is also needed to assign the stress pattern correctly. The result of syntactic analysis can also be used to group words into prosodic phrases and hence to decide which words are to be accented, so that an accent pattern can be assigned to the word sequence. Although syntactic structure provides useful cues for accenting and phrasing (and hence for generating prosody), in many cases truly expressive prosody cannot be achieved without actually understanding the meaning of the text. Although some semantic cues are used, full semantic and pragmatic analysis is beyond the capabilities of current TTS systems.
b) Speech synthesis
The information extracted by text analysis is used to generate the prosody of the speech units, including their timing structure, overall stress level, and fundamental frequency. The final module of the TTS system generates the speech sound by first selecting the appropriate synthesis units and then combining them according to the known prosodic information. The synthesis itself can be carried out with any of the methods described above.
4.4. Lab exercise: speech synthesis
Using the simple direct synthesis method:
- Use a personal computer and Matlab (or another programming language) to build a stop announcement system for public buses.
- Use a personal computer and Matlab (or another programming language) to build a system that announces the queue number of the next customer to be served at a bank branch.

Chapter 5: Speech recognition
5.1. Introduction
Machines that can recognize and understand speech spoken by anyone, in any environment, have been a supreme ambition of people in general, and of researchers and research projects on speech recognition in particular, for nearly a century. Although we have made great strides in understanding speech production and have developed many speech analysis techniques, and although much progress has been made in building important speech recognition systems, we are still far from the goal of machines that can communicate naturally with humans. In this chapter we first review the history of speech recognition research, then look briefly at a general speech recognition system and at some of the methods currently used in such systems, together with their advantages and disadvantages.
5.2. History of speech recognition systems
Research on speech recognition has been under way for nearly a century. Over that period, recognition technology can be grouped into the following generations:
Generation 1: from the 1930s to the 1950s. The technology of this generation consisted of ad hoc methods for recognizing sounds, or small vocabularies of isolated words.
Generation 2: from the 1950s to the 1960s. This generation used acoustic-phonetic methods to recognize phonemes, syllables, or digit vocabularies.
Generation 3: from the 1960s to the 1980s. This generation used pattern recognition to recognize speech with small and medium vocabularies of isolated words or connected word sequences. It included the use of LPC as the basic analysis method; LPC distance measures for scoring the similarity of patterns; dynamic programming for time alignment; pattern clustering to partition patterns into consistent reference patterns; and vector quantization coding to reduce data and computation.
Generation 4: from the 1980s to the 2000s. This generation used statistical methods with hidden Markov models (HMMs) to model the dynamics and statistics of the speech signal in continuous recognition systems; forward-backward and segmental K-means training; Viterbi time alignment; maximum likelihood (ML) and various other performance criteria and procedures for optimizing the statistical models; neural networks to estimate conditional probability densities; and adaptation algorithms that modify parameters associated with either the speech signal or the statistical model, to improve the match between model and data and thereby increase recognition accuracy.
Generation 5: we are now witnessing the development of fifth-generation speech recognition technology. This generation uses parallel processing to increase the reliability of recognition decisions; combines HMMs with acoustic-phonetic methods to detect and correct linguistic irregularities; increases the robustness of recognition systems in noisy environments; and uses machine learning to build optimal combinations of models.
Note that this division into periods is only approximate as far as dates are concerned. The generations of technology are not cleanly separated; the core ideas of each period were almost always conceived in the preceding one. The periods are drawn only to indicate when many of the research results for a given technology were published and became the standard for most recognition systems of that time.
5.3. Classification of speech recognition systems
Speech recognition systems can be classified in different ways, depending on the point of view. In terms of the speech unit used, they fall into two main types. The first is isolated word recognition, in which separately spoken words are recognized. The second is continuous recognition, in which continuous sentences are recognized. Continuous speech recognition systems can be further divided into those intended for transcription and those intended for understanding. Transcription systems aim to recognize every word correctly. Understanding systems, also called conversational speech recognition systems, focus on understanding the meaning of sentences rather than on recognizing individual words. Continuous speech recognition systems need to use sophisticated linguistic knowledge; applying grammar rules, which govern how words are arranged in a sentence, is one example.
From another point of view, speech recognition systems can be divided into speaker-independent and speaker-dependent systems. A speaker-independent system can recognize the speech of anyone. In a speaker-dependent system, by contrast, the reference patterns or models must be updated every time the speaker changes. Although speaker-independent recognition is much harder than speaker-dependent recognition, developing speaker-independent methods is particularly important for widening the range of use of recognition systems.
Speech systems can also be divided into the following groups: automatic speech recognition systems, continuous speech recognition systems, and natural language processing (NLP) systems. Automatic speech recognition systems, as the name suggests, perform recognition without any additional input from the user. Continuous speech recognition systems, as mentioned above, are able to recognize continuous sentences; in other words, in theory they do not require the speaker to pause while speaking. Natural language processing systems are used not only in speech recognition. They employ the computational methods needed for machines to understand the meaning of what is being said, rather than simply knowing which words were spoken.
More generally, Victor Zue and colleagues define a number of parameters and use them to classify recognition systems, as shown in Table 5.1.

Parameter       Typical range
Speech unit     Discrete (isolated words) - Continuous (continuous sentences)
Training        Trained before use - Continuous training
User            Speaker-dependent - Speaker-independent
Vocabulary      Small - Large
SNR             Low - High
Transducer      Restricted - Unrestricted

Table 5.1: Parameters and the corresponding classification of recognition systems
5.4. Structure of a speech recognition system
Figure 5.1 shows the basic structure of a speech recognition system. The speech signal is first processed with one of the short-time spectral analysis methods; this step is also called feature extraction or front-end processing. The result of feature extraction is a set of acoustic features assembled into a vector. Typically about 100 acoustic feature vectors are produced at the output of the analysis per second.

Figure 5.1 General structure of a speech recognition system

Matching is carried out by first training the system to build reference features and then comparing these with the input parameters to perform recognition. During training, the stream of feature vectors is fed into the system to estimate the parameters of the reference patterns. A reference pattern may model a word, a single phoneme, or some other speech unit. Depending on the recognition task, training may be a more or less complex process. For a speaker-dependent system, for example, it may involve only a few utterances, or even a single utterance, of each word to be trained. For a speaker-independent system, it may involve thousands of utterances for each desired reference pattern. These utterances are usually part of a previously collected speech database. Note that extracting representative features and building a reference model is a time-consuming and complex task.
During recognition, the sequence of feature vectors is compared with the reference patterns. The system computes the likelihood (degree of similarity) of the feature vector sequence given each reference pattern or sequence of reference patterns. The likelihood is usually computed with an efficient algorithm such as the Viterbi algorithm. The pattern or pattern sequence with the highest likelihood is taken as the recognition result.
Today the most common feature extraction methods use a Mel filterbank followed by a transformation of the Mel spectrum to the cepstral domain. We will look at the front-end processing scheme standardized by ETSI. The reference patterns are usually hidden Markov models (HMMs).
5.5. Analysis methods for speech recognition
5.5.1 Vector quantization
The result of feature extraction is a sequence of feature vectors describing the time-varying spectral characteristics of the speech signal. For convenience we denote the spectral vectors by $v_l$, $l = 1, 2, \dots, L$, where each vector typically has dimension p. Comparing the information rate of the vector representation with that of the raw (uncoded) speech waveform shows that spectral analysis greatly reduces the required information rate. For example, take speech sampled at 10 kHz with 16 bits per sample. The raw representation then needs 160000 bps to store the samples. For spectral analysis, suppose we use vectors of dimension p = 10 and 100 spectral vectors per second. If each spectral component is also represented with 16 bits, we need 100 x 10 x 16 bps, or 16000 bps. Spectral analysis therefore reduces the rate by a factor of 10, which is extremely important for storage. Based on the idea that ideally only a single spectral representation is needed for each speech unit, the raw spectral representation of the signal can be reduced further to elements drawn from a small, finite set of unique spectral vectors, each corresponding to one basic unit of speech (that is, a phoneme). Such an ideal representation is of course hard to achieve in practice, because the spectral properties of each basic speech unit vary too much. Nevertheless, the idea of building a codebook of distinct analysis vectors, even one with many more code words than the basic set of phonemes, remains attractive and underlies a family of analysis techniques known collectively as vector quantization. Following this reasoning, suppose we need a codebook of about 1024 unique spectral vectors (that is, about 25 variants for each of a set of 40 basic speech units). To represent any spectral vector, all we then need is a 10-bit number, the index of the codebook vector that best matches the input vector. At a rate of 100 spectral vectors per second, the total bit rate needed to represent the spectral vectors of the signal is about 1000 bps. This is only about 1/16 of the rate needed for the continuous spectral vectors. Vector quantization is therefore capable of representing the spectral information of speech extremely efficiently.
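The arithmetic above, and the encoding step itself, can be sketched as follows. This is an illustration under stated assumptions: the codebook is random rather than trained, the squared Euclidean distance is used as the similarity measure (the text does not fix a particular distance at this point), and the function name `vq_encode` is hypothetical.

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Index of the nearest codebook vector (squared Euclidean) for each input row."""
    diff = vectors[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(1024, 10))       # M = 1024 code vectors, p = 10
# Three input vectors: slightly perturbed copies of known code vectors.
vectors = codebook[[3, 500, 1023]] + 0.001 * rng.normal(size=(3, 10))
indices = vq_encode(vectors, codebook)

# Bit rates from the worked example in the text.
raw_bps = 10_000 * 16                        # 10 kHz sampling, 16 bits per sample
spectral_bps = 100 * 10 * 16                 # 100 vectors/s, p = 10, 16 bits each
bits_per_index = int(np.log2(len(codebook)))
vq_bps = 100 * bits_per_index                # one codebook index per vector
```

Only the index is stored or transmitted; the decoder recovers an approximation of each spectral vector by looking the index up in the same codebook.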
Before discussing the design and implementation of a practical vector quantizer, we review the advantages and disadvantages of the method. The main advantages of the vector quantization representation are:
- It reduces the storage needed for spectral analysis information, which makes it easier to apply in practical speech recognition systems.
- It reduces the computation needed to determine the similarity of spectral analysis vectors. In speech recognition, an important computational step is deciding the spectral similarity of a pair of vectors. With a vector quantization representation, this computation is usually reduced to a table lookup of the similarity between pairs of code vectors.
- It provides a discrete representation of speech sounds. By associating a phonetic label (or possibly a set of phonetic labels, or a phonetic class) with each code vector, choosing the code vector that best represents a given spectral vector becomes equivalent to assigning a phonetic label to each spectral frame of the signal. A range of existing speech recognition systems use these labels to perform recognition efficiently.
Tuy vy cng phi k n mt s hn ch ca vic s dng b m lng t ha vc-t
biu din cc vc-t ph tn hiu ting ni. Chng bao gm:
Tn ti s mo ph k tha (inherent) trong vic biu din vc-t phn tch thc t. Do ch
c s lng hu hn vc-t m, qu trnh chn vc-t thch hp nht biu din mt vc-t ph
cho trc tng t nh qu trnh lng t mt vc-t v kt qu l dn n mt sai s lng
t no . Sai s lng t gim khi s lng cc vc-t m tng. Tuy nhin, vi mi b m c
s vc-t m hu hn th lun tn ti mt mc sai s lng t.
Dung lng lu tr cho cc vc-t m thng l khng bt thng (nontrivial). Nu b m
cng ln, ngha l cng gim nh sai s lng t, th dung lng lu tr cc thnh phn b
vc-t m yu cu cng cao. Vi cc b m c kch thc ln hn hoc bng 1000, th dung
lng lu tr thng l khng bt thng. Nh vy c mt s mu thun gia sai s lng t,
qu trnh la chn vc-t m, v dung lng lu tr cc vc-t m. Trong cc thit k ng
dng thc t cn phi cn bng ba yu t ny.
a) Implementation of vector quantization
Figure 5.2 shows a block diagram of the basic VQ training and classification structure. A large set of spectral analysis vectors $v_1, v_2, \ldots, v_L$ forms the training set. This set is used to create an optimal set of codebook vectors that represents the spectral variability observed in the training set. If the size of the VQ codebook is denoted $M = 2^B$ (a B-bit codebook), then we need $L \gg M$ in order to find a good set of M vectors. In practice, L should be at least 10M for codebook training to work well. Next comes a measure of similarity, or distance, between pairs of spectral analysis vectors, which is needed both to cluster the training vectors and to assign or classify arbitrary spectral vectors to unique codebook entries. The spectral distance between two spectral vectors $v_i$ and $v_j$ is denoted $d_{ij} = d(v_i, v_j)$. A clustering procedure then partitions the L training vectors into M clusters, and the M codebook vectors are chosen as the centroids of these clusters. Finally, the classification procedure for an arbitrary speech spectral vector chooses the codebook vector closest to the input vector and uses its index as the resulting spectral representation. This is often called nearest-neighbor search or the optimal encoding procedure. The classification procedure is essentially a quantizer whose input is a speech spectral vector and whose output is the index of the codebook vector that best matches the input.

Figure 5.2 Vector quantization training and classification model

b) The VQ training set
To train the VQ codebook properly, the training vectors should span the expected range of the following:
- Talkers, including ranges of age, accent, gender, speaking rate, levels, and other variables.
- Environmental conditions, such as a quiet room, an automobile, or a noisy workstation.
- Transducers and transmission systems, including wideband microphones, telephone handsets (with carbon and electret microphones), direct transmission, telephone channels, wideband channels, and other devices.
- Speech units, including specific recognition vocabularies (for example, digits) and conversational speech.
The narrower and more clearly defined the training target (for example, a limited number of talkers, speech recorded in a quiet room, and so on), the smaller the quantization error when representing the spectrum with a codebook of fixed size. To be applicable to a wide range of practical problems, however, the training set should be as large as possible.
c) Similarity or distance measure
The spectral distance between spectral vectors $v_i$ and $v_j$ is defined as:

$$ d(v_i, v_j) = d_{ij} \begin{cases} = 0 & \text{if } v_i = v_j \\ > 0 & \text{otherwise} \end{cases} \tag{5.1} $$
d) Clustering the training vectors
The procedure for clustering the L training vectors into a set of M codebook vectors can be described as follows:
1. Initialization: Choose M arbitrary vectors from the L training vectors as the initial set of codewords.
2. Nearest-neighbor search: For each training vector, find the codeword in the current codebook that is closest (in the sense of spectral distance) and assign the vector to the corresponding cell.
3. Centroid update: Update the codeword of each cell using the centroid of the training vectors assigned to that cell.
4. Iteration: Repeat steps 2 and 3 until the average distance falls below a preset threshold.
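The four steps above can be sketched as follows. This is only a minimal illustration: the squared Euclidean distance stands in for the spectral distance, and the extra stopping test on the improvement `tol`, the handling of empty cells, and all function names are assumptions that do not come from the text.

```python
import random

def sq_dist(u, v):
    # Squared Euclidean distance, used here as a stand-in for the spectral distance.
    return sum((a - b) ** 2 for a, b in zip(u, v))

def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def train_codebook(training, M, threshold=0.0, tol=1e-9, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick M arbitrary training vectors as the initial codewords.
    codebook = [list(v) for v in rng.sample(training, M)]
    prev_avg = float("inf")
    avg = prev_avg
    for _ in range(max_iter):
        # Step 2: assign every training vector to its nearest codeword.
        cells = [[] for _ in range(M)]
        total = 0.0
        for v in training:
            dists = [sq_dist(v, y) for y in codebook]
            m = dists.index(min(dists))
            cells[m].append(v)
            total += dists[m]
        avg = total / len(training)
        # Step 3: replace each codeword by the centroid of its cell
        # (an empty cell keeps its old codeword, an assumption made here).
        codebook = [centroid(c) if c else codebook[m] for m, c in enumerate(cells)]
        # Step 4: stop when the average distance is small enough or stops improving.
        if avg <= threshold or prev_avg - avg <= tol:
            break
        prev_avg = avg
    # avg is the average distance measured before the final centroid update.
    return codebook, avg

data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook, avg_dist = train_codebook(data, M=2)
print(sorted(codebook))
```

On this toy data the two codewords settle at the centroids of the two visible clusters.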
e) Vector classification procedure
Classifying an arbitrary spectral vector is essentially a full search through the codebook to find the best match. Let the codebook vectors of an M-vector codebook be $y_m$ ($1 \le m \le M$), and let the spectral vector to be classified (and quantized) be $v$. The index $m^*$ of the best codeword is then:

$$ m^* = \arg\min_{1 \le m \le M} d(v, y_m) \tag{5.2} $$

For codebooks with large M (for example, $M \ge 1024$), the computation in (5.2) can become excessive, depending on the exact details of the spectral distance measure. In practice, suboptimal search algorithms are often used.
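A minimal sketch of the full search in (5.2) is shown below. The squared Euclidean distance is an assumption standing in for the spectral distance $d$, and the returned index is 0-based, whereas the text counts codewords from 1.

```python
def quantize(v, codebook):
    """Return the 0-based index of the codeword closest to v (full search)."""
    def d(u, w):
        # Assumed distance: squared Euclidean.
        return sum((a - b) ** 2 for a, b in zip(u, w))
    return min(range(len(codebook)), key=lambda m: d(v, codebook[m]))

codebook = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
print(quantize([9.0, 9.5], codebook))
```

The cost is one distance evaluation per codeword, which is why large codebooks motivate the suboptimal search methods mentioned above.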
5.5.2 LPC processor in speech recognition
The previous section discussed the general properties of LPC analysis. This section describes in detail how an LPC processor is used in speech recognition systems. Figure 5.3 shows a block diagram of the LPC processor. The basic processing steps are as follows:

Figure 5.3 Block diagram of the LPC processor in speech recognition

a) Pre-emphasis
The digitized speech signal $s(n)$ is first passed through a low-order digital filter, typically a first-order finite impulse response (FIR) filter, to flatten the signal spectrum. This makes the signal less susceptible to finite-precision effects in later processing. The pre-emphasis filter may have fixed coefficients, or it may be a slowly varying adaptive filter. In speech processing, a fixed first-order system is most commonly used:

$$ H(z) = 1 - a z^{-1}, \qquad 0.9 \le a \le 1.0 \tag{5.3} $$

The output of the pre-emphasis network, $\tilde{s}(n)$, is then:

$$ \tilde{s}(n) = s(n) - a\,s(n-1) \tag{5.4} $$

A common value for the fixed coefficient a is about 0.95 (in fixed-point implementations, a is often chosen as 15/16 = 0.9375). Figure 5.4 shows the magnitude response $|H(e^{j\omega})|$ for $a = 0.95$. At $\omega = \pi$, that is, at half the sampling rate, the magnitude is boosted by about 32 dB relative to its value at $\omega = 0$.

Figure 5.4 Magnitude spectrum of the pre-emphasis network
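The pre-emphasis difference equation (5.4) and the quoted 32 dB boost can be checked with the short sketch below. The function names are hypothetical, and the sample before the start of the signal is assumed to be zero.

```python
import cmath
import math

def pre_emphasize(s, a=0.95):
    """Apply s~(n) = s(n) - a*s(n-1), assuming s(-1) = 0."""
    return [s[n] - (a * s[n - 1] if n > 0 else 0.0) for n in range(len(s))]

def gain_db(omega, a=0.95):
    """Magnitude of H(e^{j*omega}) = 1 - a*e^{-j*omega}, in dB."""
    h = 1 - a * cmath.exp(-1j * omega)
    return 20 * math.log10(abs(h))

# Boost at half the sampling rate relative to zero frequency.
boost = gain_db(math.pi) - gain_db(0.0)
print(round(boost, 1), "dB")
```

For a = 0.95 the ratio of the two magnitudes is 1.95/0.05 = 39, which is just under 32 dB.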

When an adaptive filter is used, its transfer function usually has the form:

$$ H(z) = 1 - a_n z^{-1} \tag{5.5} $$

where $a_n$ changes with time n according to a chosen adaptation criterion. A typical choice is $a_n = r_n(1)/r_n(0)$.
b) Frame blocking
The pre-emphasized signal $\tilde{s}(n)$ is blocked into frames of N samples, with adjacent frames separated by M samples. Figure 5.5 illustrates the case M = N/3. The first frame consists of the first N samples; the second frame begins M samples after the first and overlaps it by N-M samples. Similarly, the third frame begins 2M samples after the first (M samples after the second) and overlaps the first and second frames by N-2M and N-M samples, respectively. The process continues until all of the speech is accounted for within one or more frames. Clearly, if $M \le N$, adjacent frames overlap and the resulting LPC spectral estimates are correlated from frame to frame; if $M \ll N$, the LPC spectral estimates are quite smooth from frame to frame. If, on the other hand, M > N, there is no overlap between frames; some of the signal is lost entirely (it appears in no analysis frame), and the correlation between the LPC spectral estimates of adjacent frames contains a noisy component whose magnitude increases as M increases (that is, as more samples are skipped). This is intolerable in any LPC analysis for speech recognition. Denoting the l-th frame by $x_l(n)$ and assuming L frames in total:

$$ x_l(n) = \tilde{s}(Ml + n), \qquad n = 0, 1, \ldots, N-1; \quad l = 0, 1, \ldots, L-1 \tag{5.6} $$

That is, the first frame $x_0(n)$ contains the samples $\tilde{s}(0), \tilde{s}(1), \ldots, \tilde{s}(N-1)$; the second frame $x_1(n)$ contains $\tilde{s}(M), \tilde{s}(M+1), \ldots, \tilde{s}(M+N-1)$; and the L-th frame $x_{L-1}(n)$ contains $\tilde{s}(M(L-1)), \tilde{s}(M(L-1)+1), \ldots, \tilde{s}(M(L-1)+N-1)$. For speech sampled at 6.67 kHz, typical values are N = 300 and M = 100, corresponding to 45 ms frames spaced 15 ms apart.

Figure 5.5 Frame blocking in LPC analysis for speech recognition
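Equation (5.6) translates directly into list slicing. The sketch below assumes that only complete frames are kept (trailing samples that do not fill a frame are dropped), which the text does not specify.

```python
def frame_signal(s, N, M):
    """Split s into frames of N samples whose start points are M samples apart."""
    if len(s) < N:
        return []
    L = 1 + (len(s) - N) // M          # number of complete frames
    return [s[M * l: M * l + N] for l in range(L)]

frames = frame_signal(list(range(10)), N=4, M=2)
print(frames)

# Frame length and spacing in milliseconds for the values quoted in the text.
fs = 6670.0
print(round(300 / fs * 1000), "ms frames,", round(100 / fs * 1000), "ms apart")
```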

c) Windowing
The next step in LPC processing is to window each individual frame so as to minimize signal discontinuities at the beginning and end of the frame. As discussed for the frequency-domain view in the general introduction, the window tapers the signal to zero at the start and end of each frame. If the window is defined as $w(n)$, $0 \le n \le N-1$, the windowed signal is:

$$ \tilde{x}_l(n) = x_l(n)\,w(n), \qquad 0 \le n \le N-1 \tag{5.7} $$

The window typically used with the autocorrelation method of LPC in speech recognition systems is the Hamming window:

$$ w(n) = 0.54 - 0.46\cos\!\left(\frac{2\pi n}{N-1}\right), \qquad 0 \le n \le N-1 \tag{5.8} $$
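A minimal sketch of (5.7) and (5.8) follows; the function names are chosen here for illustration only.

```python
import math

def hamming(N):
    """Hamming window w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)), n = 0..N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def apply_window(frame):
    """Multiply a frame sample by sample with a Hamming window of the same length."""
    w = hamming(len(frame))
    return [x * wn for x, wn in zip(frame, w)]

w = hamming(5)
print([round(v, 2) for v in w])
```

The window equals 0.08 at both ends and 1.0 at the centre, so the frame edges are strongly attenuated rather than set exactly to zero.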
d) Autocorrelation analysis
Each windowed frame is then autocorrelated:

$$ r_l(m) = \sum_{n=0}^{N-1-m} \tilde{x}_l(n)\,\tilde{x}_l(n+m), \qquad m = 0, 1, \ldots, p \tag{5.9} $$

where the highest autocorrelation lag, p, is the order of the LPC analysis. Typically p is chosen between 8 and 16. A side benefit of the autocorrelation method is that the zeroth autocorrelation, $r_l(0)$, is the energy of the l-th frame. Frame energy is an important parameter in speech detection systems.
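Equation (5.9) can be written in a few lines. This is a direct, unoptimized sketch; the function name is hypothetical.

```python
def autocorrelation(x, p):
    """Return [r(0), r(1), ..., r(p)] for one windowed frame x."""
    N = len(x)
    return [sum(x[n] * x[n + m] for n in range(N - m)) for m in range(p + 1)]

r = autocorrelation([1.0, 2.0, 3.0], p=2)
print(r)   # r[0] is the frame energy
```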
e) LPC analysis
The next step is LPC analysis, which converts each frame of p+1 autocorrelations into an LPC parameter set. This set may be the LPC coefficients, the reflection coefficients, the log area ratio coefficients, the cepstral coefficients, or any desired transformation of these sets. The conversion is usually carried out with Durbin's method, given below. For convenience, the frame index l of $r_l(m)$ is dropped.

$$ E^{(0)} = r(0) \tag{5.10} $$

$$ k_i = \frac{r(i) - \sum_{j=1}^{i-1} \alpha_j^{(i-1)}\, r(|i-j|)}{E^{(i-1)}}, \qquad 1 \le i \le p \tag{5.11} $$

$$ \alpha_i^{(i)} = k_i \tag{5.12} $$

$$ \alpha_j^{(i)} = \alpha_j^{(i-1)} - k_i\,\alpha_{i-j}^{(i-1)}, \qquad 1 \le j \le i-1 \tag{5.13} $$

$$ E^{(i)} = \left(1 - k_i^2\right) E^{(i-1)} \tag{5.14} $$

The summation in (5.11) is omitted for i = 1. These equations are solved recursively for i = 1, 2, ..., p, and the final result is:

$$ a_m = \alpha_m^{(p)}, \qquad 1 \le m \le p \tag{5.15} $$

$$ k_m = \text{PARCOR coefficients} \tag{5.16} $$

$$ g_m = \log\!\left(\frac{1 - k_m}{1 + k_m}\right) \tag{5.17} $$

Here (5.15) gives the LPC coefficients, (5.16) the reflection coefficients, and (5.17) the log area ratio coefficients.
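The Durbin recursion (5.10) to (5.17) can be sketched as below. The function names and the choice to return the final prediction error alongside the coefficients are assumptions made for this illustration; the input is assumed to have r(0) > 0.

```python
import math

def durbin(r, p):
    """Solve for LPC coefficients from autocorrelations r[0..p].

    Returns (a, k, E): LPC coefficients a_1..a_p, reflection coefficients
    k_1..k_p, and the final prediction error E^(p).
    """
    E = r[0]                                   # (5.10)
    alpha = [0.0] * (p + 1)                    # alpha[j] holds alpha_j; index 0 unused
    k = [0.0] * (p + 1)
    for i in range(1, p + 1):
        acc = r[i] - sum(alpha[j] * r[i - j] for j in range(1, i))
        k[i] = acc / E                         # (5.11)
        new = alpha[:]
        new[i] = k[i]                          # (5.12)
        for j in range(1, i):
            new[j] = alpha[j] - k[i] * alpha[i - j]   # (5.13)
        alpha = new
        E = (1 - k[i] ** 2) * E                # (5.14)
    return alpha[1:], k[1:], E

def log_area_ratios(k):
    """Log area ratio coefficients g_m from reflection coefficients, as in (5.17)."""
    return [math.log((1 - km) / (1 + km)) for km in k]

# Autocorrelation of a first-order process: r(m) = 0.5**m.
a, k, E = durbin([1.0, 0.5, 0.25], p=2)
print(a, k, E)
```

For this input the second reflection coefficient is zero, which reflects the fact that a first-order predictor already explains the sequence.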
f) Converting LPC parameters to cepstral coefficients
An important parameter set that can be derived directly from the LPC coefficients is the set of LPC cepstral coefficients. The recursion is:

$$ c_0 = \ln \sigma^2 \tag{5.18} $$

$$ c_m = a_m + \sum_{k=1}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \qquad 1 \le m \le p \tag{5.19} $$

$$ c_m = \sum_{k=1}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \qquad m > p \tag{5.20} $$

where $\sigma^2$ is the gain term of the LPC model. The cepstral coefficients are the coefficients of the Fourier transform representation of the log magnitude spectrum. They have been shown to be a more robust and reliable feature set for speech recognition than the LPC coefficients, the reflection coefficients, or the log area ratio coefficients. Usually a representation with Q > p cepstral coefficients is used, where commonly $Q \approx 3p/2$.
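The recursion (5.18) to (5.20) can be sketched as follows. The code assumes, as is usual for this recursion, that $a_j = 0$ for $j > p$, so that one loop covers both (5.19) and (5.20); the function name and argument order are hypothetical.

```python
import math

def lpc_to_cepstrum(a, gain_sq, Q):
    """Return [c_0, c_1, ..., c_Q] from LPC coefficients a = [a_1, ..., a_p]."""
    p = len(a)
    c = [0.0] * (Q + 1)
    c[0] = math.log(gain_sq)                      # (5.18)
    for m in range(1, Q + 1):
        acc = a[m - 1] if m <= p else 0.0         # a_m term exists only for m <= p
        for k in range(1, m):
            if m - k <= p:                        # a_{m-k} is taken as 0 beyond order p
                acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc                                # (5.19) and (5.20)
    return c

# Single-coefficient example with a_1 = 0.5 and unit gain.
c = lpc_to_cepstrum([0.5], gain_sq=1.0, Q=4)
print([round(v, 5) for v in c])
```

For a single coefficient $a_1$, the recursion gives $c_m = a_1^m / m$, which provides a simple check of the implementation.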
g) Parameter weighting
The low-order cepstral coefficients are sensitive to the overall spectral slope, while the high-order coefficients are sensitive to noise. For this reason it has become standard practice to weight the cepstral coefficients with a tapered window so as to reduce these sensitivities. One way to motivate the use of a cepstral window is to consider the Fourier representation of the log magnitude spectrum and of its derivative with respect to frequency:

$$ \log\left|S(e^{j\omega})\right| = \sum_{m=-\infty}^{\infty} c_m e^{-j\omega m} \tag{5.21} $$

$$ \frac{\partial}{\partial\omega}\left[\log\left|S(e^{j\omega})\right|\right] = \sum_{m=-\infty}^{\infty} (-jm)\, c_m e^{-j\omega m} \tag{5.22} $$

The differential log magnitude spectrum has the property that any fixed spectral slope in the log magnitude spectrum becomes a constant. Furthermore, any prominent spectral peak in the log magnitude spectrum, that is, a formant, is preserved in the differential log magnitude spectrum. Multiplying the cepstral coefficients by $(-jm)$ therefore amounts to reweighting the parameters:

$$ \frac{\partial}{\partial\omega}\left[\log\left|S(e^{j\omega})\right|\right] = \sum_{m=-\infty}^{\infty} \hat{c}_m e^{-j\omega m} \tag{5.23} $$

where

$$ \hat{c}_m = c_m(-jm) \tag{5.24} $$

To achieve robustness for large m (that is, low weight near m = Q) and to truncate the infinite sum in (5.23), a more general weighting is used:

$$ \hat{c}_m = w_m c_m \tag{5.25} $$

A suitable weighting is the bandpass lifter (a filter in the cepstral domain):

$$ w_m = 1 + \frac{Q}{2}\sin\!\left(\frac{\pi m}{Q}\right), \qquad 1 \le m \le Q \tag{5.26} $$

The weighting in (5.26) truncates the infinite computation and de-emphasizes the coefficients $c_m$ around m = 1 and m = Q.
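The lifter of (5.26) and the weighting of (5.25) can be sketched as below; the function names are assumptions, and the input list is taken to hold $c_1, \ldots, c_Q$ only (without $c_0$).

```python
import math

def lifter_weights(Q):
    """Weights w_m = 1 + (Q/2)*sin(pi*m/Q) for m = 1..Q."""
    return [1 + (Q / 2) * math.sin(math.pi * m / Q) for m in range(1, Q + 1)]

def weight_cepstrum(c):
    """Apply c^_m = w_m * c_m to a list [c_1, ..., c_Q]."""
    w = lifter_weights(len(c))
    return [wm * cm for wm, cm in zip(w, c)]

w = lifter_weights(12)
print([round(v, 2) for v in w])
```

With Q = 12 the weight peaks at 7 for m = 6 and falls back to 1 at m = 12, so the middle coefficients are emphasized relative to both ends.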
h) Cepstral derivatives
The cepstral representation of the speech spectrum describes well the local spectral properties of the signal within a given analysis frame. The representation can be improved by extending the analysis to include information about the temporal cepstral derivative. Both first and second derivatives have been found to improve the performance of speech recognition systems. To introduce time into the cepstral representation, denote the m-th cepstral coefficient at time t by $c_m(t)$. In practice, the sampling time t refers to the analysis frame rather than to an arbitrary time instant. The time derivative of the cepstral coefficients is approximated as follows. The time derivative of the log magnitude spectrum has the Fourier series representation:

$$ \frac{\partial}{\partial t}\left[\log\left|S(e^{j\omega}, t)\right|\right] = \sum_{m=-\infty}^{\infty} \frac{\partial c_m(t)}{\partial t}\, e^{-j\omega m} \tag{5.27} $$

The temporal cepstral derivative is therefore determined in the same way. Because $c_m(t)$ is a discrete-time representation (t is the frame index), simply using first- or second-order differences to approximate the derivative is inappropriate, since the result is very noisy. A reasonable approach is to approximate $\partial c_m(t)/\partial t$ by an orthogonal polynomial fit, that is, a least-squares estimate of the derivative, over a finite-length window:

$$ \frac{\partial c_m(t)}{\partial t} = \Delta c_m(t) \approx \mu \sum_{k=-K}^{K} k\, c_m(t+k) \tag{5.28} $$

where $\mu$ is an appropriate normalization constant and (2K+1) is the number of frames over which the computation is performed. Typically K = 3 has been found appropriate for computing the first-order derivative. With this procedure, for each frame t the result of the LPC analysis is a vector of Q weighted cepstral coefficients, extended with Q time-derivative components:

$$ o'_t = \left(\hat{c}_1(t), \hat{c}_2(t), \ldots, \hat{c}_Q(t), \Delta c_1(t), \Delta c_2(t), \ldots, \Delta c_Q(t)\right) \tag{5.29} $$

In (5.29), $o_t$ is a vector with 2Q components and $(\cdot)'$ denotes matrix transpose.
Similarly, if the second-order derivatives $\Delta^2 c_m(t)$ are computed and appended to $o_t$, a new vector with 3Q components is obtained.
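Equation (5.28) can be sketched as below. Two choices here are assumptions not fixed by the text: the normalization constant is set to $\mu = 1/\sum_k k^2$, so that a unit-slope ramp yields a derivative of 1, and frames beyond the ends of the utterance are replaced by the first or last frame.

```python
def delta_cepstrum(frames, K=3, mu=None):
    """Approximate the time derivative of each cepstral coefficient.

    frames is a list of cepstral vectors, one per analysis frame.
    """
    if mu is None:
        mu = 1.0 / sum(k * k for k in range(-K, K + 1))   # assumed normalization
    T = len(frames)
    out = []
    for t in range(T):
        vec = []
        for m in range(len(frames[t])):
            acc = 0.0
            for k in range(-K, K + 1):
                tt = min(max(t + k, 0), T - 1)            # clamp at the edges
                acc += k * frames[tt][m]
            vec.append(mu * acc)
        out.append(vec)
    return out

# A coefficient that increases by 1 per frame has derivative 1 away from the edges.
ramp = [[float(t)] for t in range(10)]
print(delta_cepstrum(ramp)[5])
```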
i) Typical values of the LPC analysis parameters
The LPC analysis computations depend on a number of parameters: the number of samples in the analysis frame, N; the number of samples separating the starts of adjacent frames, M; the LPC analysis order, p; the dimension of the derived cepstral vector, Q; and the number of frames, K, over which the cepstral time derivatives are computed. Although each of these parameters can vary over a wide range depending on the particular system, typical values for three sampling rates (6.67 kHz, 8 kHz, and 10 kHz) are given in the following table.

Parameter   Fs = 6.67 kHz   Fs = 8 kHz    Fs = 10 kHz
N           300 (45 ms)     240 (30 ms)   300 (30 ms)
M           100 (15 ms)     80 (10 ms)    100 (10 ms)
p           8               10            10
Q           12              12            12
K           3               3             3
Table 5.2: Typical parameter values for LPC analysis

5.5.3 MFCC analysis in speech recognition
Figure 5.6 shows a block diagram of Mel frequency cepstral analysis used to extract features from the speech signal. It is a widely used technique that represents the class of feature extraction methods known as MFCCs (Mel frequency cepstral coefficients). First, the speech signal is filtered by a high-pass filter with a very low cut-off frequency to remove any DC component that the ADC may introduce. This filtering is particularly needed for accurate computation of the frame energy in short-term analysis. The signal energy and the cepstral parameters are computed for each frame of a sliding window with shift $d_{shift} = 10$ ms. Because human perception of loudness is nonlinear, the signal energy is usually computed on a logarithmic scale. The logarithmic frame energy is used as one component of the feature vector. A second high-pass filter then pre-emphasizes the signal to boost the high-frequency region, where the signal tends to have low energy. The short-term spectrum is then computed by multiplying the samples of the frame by a Hamming window and applying the fast Fourier transform (FFT). Only the magnitude spectrum is kept, because the short-term phase spectrum carries no useful information about the speech signal. The human auditory system is known to accumulate energy in critical bands. Based on this property, a Mel-scale filterbank is used. The filterbank consists of 23 subbands. The FFT spectral components are multiplied by a triangular weighting function and accumulated over a given frequency range to form one Mel spectral component. The bandwidth of the subbands increases with frequency, following a linear spacing on the Mel frequency scale. As with the signal energy, the logarithm of the Mel spectrum is taken. Adjacent Mel spectral components are fairly correlated. To extract features that are approximately statistically independent of one another, the discrete cosine transform (DCT) is applied to the log Mel spectrum. Such statistically independent features simplify the modelling of speech characteristics in the reference models and the computation of similarities during pattern matching.

Figure 5.6 Block diagram of MFCC analysis
In the front-end standardized by ETSI, 13 cepstral coefficients are computed, including the zeroth cepstral coefficient. Note that the zeroth cepstral coefficient represents the mean of the log Mel spectrum and is therefore closely related to the frame energy. Usually either the logarithmic frame energy computed from the time signal or the zeroth cepstral coefficient is used as a parameter in recognition. Feature vectors for speech recognition typically contain the logarithmic frame energy and the 12 cepstral coefficients $C_1$ to $C_{12}$. To apply adaptation techniques that improve recognition performance, the parameter $C_0$ must be known. $C_0$ is therefore often extracted additionally for use in training, where it becomes a parameter of the HMM. This means that a set of cepstral coefficients in the reference patterns can be transformed back to the Mel spectrum. Note, however, that $C_0$ is not used in pattern matching.
The acoustic parameters introduced above are called static parameters, because they are computed from the speech signal over a short frame of about 25 ms. To improve recognition performance, a set of dynamic parameters is also considered. These are obtained by observing the contour of each static parameter over time and computing the derivative of that contour. Parameters computed in this way are called delta coefficients. The first-order derivative $\Delta C_i(k)$ of the cepstral coefficient $C_i$ is computed as:

$$ \Delta C_i(k) = \frac{\sum_{j=1}^{N_\Delta} j \left[ C_i(k+j) - C_i(k-j) \right]}{2 \sum_{j=1}^{N_\Delta} j^2} \tag{5.30} $$

The value of $N_\Delta$ in (5.30) is usually chosen as 3, so the delta coefficients are computed from 7 frames. They therefore contain information about the dynamic behaviour of the signal over an interval of about 85 ms. Similarly, second-order derivatives can be computed by applying (5.30) to the contours of the first-order derivatives. The resulting coefficients are called delta-delta coefficients. The window used for the second-order derivatives is usually shorter than that used for the first-order derivatives, so that the delta-delta coefficients describe a signal segment of about 150 ms in total. The delta and delta-delta coefficients are appended to the static parameters to form the feature vector. A typical feature vector has 39 components: the logarithmic frame energy and the 12 cepstral coefficients $C_1$ to $C_{12}$ (13 static parameters), together with their delta and delta-delta coefficients.
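The construction of the 39-component feature vector from (5.30) can be sketched as follows. The function names, the edge handling (repeating the first and last frames), and the shorter window `n_accel=2` for the second-order derivative are assumptions made for this illustration.

```python
def deltas(seq, n_delta=3):
    """Apply (5.30) to every component of a sequence of feature vectors."""
    T = len(seq)
    dim = len(seq[0])
    denom = 2 * sum(j * j for j in range(1, n_delta + 1))
    out = []
    for k in range(T):
        row = []
        for i in range(dim):
            num = 0.0
            for j in range(1, n_delta + 1):
                ahead = seq[min(k + j, T - 1)][i]     # clamp at the last frame
                behind = seq[max(k - j, 0)][i]        # clamp at the first frame
                num += j * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out

def stack_features(static, n_delta=3, n_accel=2):
    """Concatenate static, delta, and delta-delta parameters frame by frame."""
    d = deltas(static, n_delta)
    dd = deltas(d, n_accel)        # (5.30) applied to the delta contours
    return [s + a + b for s, a, b in zip(static, d, dd)]

# 20 frames of 13 static parameters (log energy plus C1..C12), here a simple ramp.
static = [[float(t)] * 13 for t in range(20)]
feats = stack_features(static)
print(len(feats[10]))
```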
To make feature extraction more robust to background noise and unknown transfer functions, the extraction scheme shown in Figure 5.7 is used. This scheme is also a front-end standardized by ETSI. In addition to the feature extraction block described above, it contains two further processing blocks. The first is a noise reduction block, which consists of a two-stage Wiener filter. The noise-reduced signal is then passed to the cepstral analysis block described above. To reduce the influence of unknown transfer functions on the extracted parameters, a blind equalization block is used. This block compares the speech spectrum with a flat spectrum and uses the least mean square (LMS) algorithm to adapt the equalization filter.


Figure 5.7 Block diagram of the improved cepstral analysis front-end

5.6. Introduction to speech recognition methods
This section gives a brief overview of the methods used in speech recognition systems. Besides outlining the principles, it also considers the strengths and weaknesses of each method.
Broadly speaking, three main approaches are used in speech recognition systems: the acoustic-phonetic approach, the pattern recognition approach, and the artificial intelligence approach.
The acoustic-phonetic approach is based on the theory of acoustic phonetics, which postulates that spoken language contains a finite number of distinct phonetic units, and that these phonetic units are fully characterized by a set of properties of the speech signal or of its spectrum. Although the acoustic properties of phonetic units vary greatly, both across speakers and with neighbouring phonetic units (the so-called coarticulation of sounds), it is assumed that the rules governing this variability are straightforward and can be learned and applied in practical situations. The first step in the acoustic-phonetic approach is therefore segmentation and labelling. This step segments the speech signal into discrete regions (in time) in which the acoustic properties of the signal are representative of one (or possibly several) phonetic units (or classes), and then attaches one or more phonetic labels to each segment according to its acoustic properties. The next step in recognition is to determine a valid word (or string of words) from the sequence of phonetic labels produced by the first step.
The pattern recognition approach to speech recognition uses the speech patterns directly, without explicit feature determination (in the acoustic-phonetic sense) and without segmentation. Like most pattern recognition methods, it has two steps: training of speech patterns, and recognition of patterns by pattern comparison. Knowledge about the speech signal is brought into the system through the training procedure. The underlying idea is that if enough versions of a pattern to be recognized (a sound, a word, a phrase, and so on) are included in the training set, the training procedure can adequately characterize the acoustic properties of the pattern (without regard to, or knowledge of, any other pattern presented during training). Pattern comparison directly compares the unknown speech (the speech to be recognized) with each pattern learned in training and classifies the unknown speech according to how well the patterns match. The pattern recognition approach has the following advantages:
- It is simple to use.
- It is robust and invariant to different vocabularies, users, and feature sets. This allows the algorithm to be applied widely to different speech units (from phoneme-like units to words, phrases, or sentences), vocabularies, talker populations, background environments, and so on.
- It performs well. The pattern recognition approach has been shown to give good recognition performance on any task of moderate technological demand.
The artificial intelligence approach to speech recognition is a hybrid of the two approaches above. It attempts to mechanize the recognition procedure according to the way a person applies intelligence in visualizing, analyzing, and finally making a decision on the measured acoustic features. Techniques used within this class of methods include the following: the use of an expert system for segmentation and labelling, so that the most difficult and most important step of recognition can be carried out with more than just the acoustic information used in pure acoustic-phonetic methods; learning and adapting over time; and the use of neural networks for learning the relationships between phonetic events and all known inputs, as well as for discriminating between similar sound classes.
The use of neural networks can be regarded as a separate structural approach to speech recognition, or as an implementational architecture that can be incorporated into any of the three approaches just described.
5.6.1 The acoustic-phonetic approach
Figure 5.8 shows a block diagram of a speech recognition system based on the acoustic-phonetic approach. The first processing step, as in all other approaches to speech recognition, is speech analysis. Speech analysis (also called feature measurement) provides a suitable spectral representation of the time-varying characteristics of the speech signal. As noted earlier, one of the most common methods of spectral analysis in speech recognition systems is LPC analysis. In general, the task of spectral analysis is to provide an appropriate spectral representation of the speech signal over time.


Figure 5.8 Block diagram of an acoustic-phonetic speech recognition system

The next processing step is feature detection. The idea is to convert the spectral measurements into a set of features that broadly describe the acoustic properties of the different phonetic units. The features used for speech recognition include nasality (the presence or absence of nasal resonance), frication (the presence or absence of random excitation in the signal), formant locations (the frequencies of the first three vocal tract resonances), voiced or unvoiced classification (periodic or aperiodic excitation), and the ratio of high-frequency to low-frequency energy. Some features are inherently binary, such as nasality, frication, and voiced/unvoiced; others are continuous, such as formant locations and energy ratios. The feature detection stage usually consists of a set of detectors that operate in parallel and use appropriate processing and logic to decide on the presence or absence, or the value, of a feature. The algorithms used for individual feature detectors are often quite sophisticated and perform a great deal of signal processing, although in some cases they are trivial estimation procedures.
The third step is segmentation and labelling. The system tries to find stable regions, in which the features change very little, and then labels each segmented region according to how well the features within it match those of individual phonetic units. This stage is the heart of the acoustic-phonetic recognizer and is also the most difficult one to carry out reliably. For this reason, various control strategies are used to limit the range of segmentation points and label possibilities. For example, in isolated word recognition, the constraint that a word contains at least two and at most six phonetic units means that the control strategy needs to consider only solutions with between one and five internal segmentation points. Furthermore, the labelling strategy can exploit lexical constraints on words, considering only words with n phonetic units whenever the segmentation yields n-1 segmentation points. Such constraints are important because they reduce the search space and significantly improve system performance.
The result of the segmentation and labelling stage is usually a phoneme lattice. The lattice is used in a lexical access procedure to determine the best matching word or sequence of words. Besides phoneme lattices, word or syllable lattices can also be built by incorporating lexical and syntactic constraints into the control strategy mentioned above. The quality of the match between the features within a segment and the phonetic units can be used to assign probabilities to the labels, which can then be used in a probabilistic lexical access procedure. The output of the recognizer is the word or word sequence that best matches, in some well-defined sense, the sequence of phonetic units in the phoneme lattice.
a) Vowel classifier
Consider the labelling procedure for a segment that has been classified as a vowel. Figure 5.9 shows a flow chart of an acoustic-phonetic vowel classifier. Assume that three features have been detected over the segment: the first formant $F_1$, the second formant $F_2$, and the segment duration D. Assume further that only steady vowels are considered, that is, diphthongs are excluded. To classify a vowel segment as one of the 10 steady vowels, several tests are needed to separate groups of vowels. As shown in Figure 5.9, the first test separates vowels with low $F_1$ (called diffuse vowels, such as /i/, /u/, ...) from vowels with high $F_1$ (called compact vowels, including /a/, ...). Each of these subsets is then split further on the basis of $F_2$: acute vowels have high $F_2$ and grave vowels have low $F_2$. The third test, based on segment duration, separates tense vowels (large D) from lax vowels (small D). Finally, a finer test on the formant values separates the remaining unresolved vowels into flat vowels (those for which $F_1 + F_2$ exceeds some threshold T) and plain vowels (those for which $F_1 + F_2$ falls below the threshold T).


Figure 5.9 A simple method for classifying English vowels

Note that several thresholds are used in the vowel classifier. These thresholds are usually determined experimentally so as to maximize classification accuracy on a given speech corpus.
b) Speech sound classification
Vowel classification is only a small part of phonetic labelling in the acoustic-phonetic approach. In theory, a method is needed for classifying an arbitrary segment as one or more of the 40-plus phonetic units discussed earlier. This section considers a simpler problem: classifying a speech segment into one or several broad sound classes, such as unvoiced stops, voiced stops, and unvoiced fricatives. There is no simple or generally accepted procedure for this task; nevertheless, Figure 5.10 shows one simple and intuitive way of performing such a classification.
The method uses a binary tree to decide among the sound classes. The first decision is a sound/silence split. The speech features (essentially the energy in this case) are compared with a chosen threshold, and silence is split off if the test for speech sound is negative. The second decision is a voiced/unvoiced split (based on the presence of periodicity within the segment), which separates unvoiced sounds from voiced sounds. A further test then separates unvoiced stop consonants from unvoiced fricatives. A high-frequency/low-frequency (energy) test separates voiced fricatives from the other voiced sounds. Voiced stops can be separated by checking whether the preceding sound is silence (or silence-like). Finally, a vowel/consonant spectral test (searching for spectral gaps) separates vowels from consonants.


Figure 5.10 Classification of speech sounds using a binary tree

The vowel classification procedure shown in Figure 5.9 can then be used as a finer classifier of the vowels.
Note that the classification procedure described above and illustrated in Figure 5.10 is only a rough illustration and is error-prone. For example, some voiced stops are not preceded by silence or a silence-like sound. Another problem is that it provides no way of distinguishing diphthongs from vowels.
c) Problems with the acoustic-phonetic approach
The acoustic-phonetic approach to speech recognition has many problems, which account for its lack of success in practical speech recognition systems. They include the following:
1. The method requires extensive knowledge of the acoustic properties of phonetic units. This knowledge is usually incomplete and unavailable except in the simplest cases.
2. The choice of features is made mostly on ad hoc grounds. In most systems the features are chosen on the basis of intuition, and they are not optimal in any well-defined and meaningful sense.
3. The design of the sound classifiers is also not optimal. Ad hoc methods are generally used to build the binary decision trees. More recently, classification and regression tree (CART) methods have been used to make the decision trees more robust. However, because the choice of features is mostly suboptimal, optimal implementations of CART are rarely achieved.
4. No well-defined, automatic procedure exists for tuning the method (for example, adjusting the decision thresholds) on real labelled speech. In fact, there is not even an ideal way of labelling the training speech that is consistent and widely agreed on by linguistic experts.
Because of these problems, the acoustic-phonetic approach remains an interesting idea, but much more research and understanding are needed before it can be used successfully in practical recognition systems.
5.6.2 Statistical pattern-recognition approach
Figure 5.11 shows the block diagram of a recognition system based on the pattern-recognition approach. The approach consists of four steps:
1. Feature measurement, in which a sequence of measurements is made on the input signal to define the test pattern. For speech signals, the feature measurements are usually the output of some type of spectral analysis, such as a filter-bank analyzer, an LPC analyzer, or a DFT analysis.
2. Pattern training, in which one or more test patterns corresponding to speech sounds of the same class are used to create a pattern representative of the features of that class. The resulting pattern, generally called a reference pattern, can be an exemplar or template derived from some type of averaging, or it can be a model that characterizes the statistics of the features of the reference pattern.
3. Pattern classification, in which the unknown test pattern is compared with each (sound) class reference pattern and a measure of similarity (distance) between the test pattern and each reference pattern is computed. To compare speech patterns (which consist of a sequence of spectral vectors), we need both a local distance measure, defined as the spectral distance between two well-defined spectral vectors, and a global time alignment procedure (often called a dynamic time warping algorithm) that compensates for the different speaking rates (time scales) of the two patterns.
4. Decision logic, in which the reference pattern similarity scores are used to decide which reference pattern (or possibly which sequence of reference patterns) best matches the unknown test pattern.
The factors that distinguish different pattern-recognition approaches are the types of feature measurement, the choice of templates or models for the reference patterns, and the method used to create reference patterns and classify unknown test patterns.
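
As a minimal sketch of the classification and decision steps described above, the following Python fragment compares a test pattern with stored reference patterns. It assumes a Euclidean local distance and a basic dynamic time warping recursion; these choices, the function names, and the toy two-dimensional "spectral vectors" are illustrative assumptions, not part of the original text.

```python
import math

def local_distance(u, v):
    """Local distance: Euclidean distance between two spectral vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw_distance(test, ref):
    """Global distance between two vector sequences after time alignment."""
    n, m = len(test), len(ref)
    inf = float("inf")
    # cost[i][j]: best accumulated distance aligning test[:i] with ref[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_distance(test[i - 1], ref[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def classify(test, references):
    """Decision logic: return the label of the closest reference pattern."""
    return min(references, key=lambda label: dtw_distance(test, references[label]))

# Toy reference patterns (hypothetical data).
references = {
    "yes": [[1.0, 0.0], [2.0, 1.0], [3.0, 1.0]],
    "no": [[0.0, 3.0], [0.0, 2.0], [1.0, 0.0]],
}
# A slower utterance of "yes": the first frame is repeated.
test_pattern = [[1.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 1.0]]
best = classify(test_pattern, references)
```

The time alignment absorbs the repeated frame, so the slower utterance still matches its reference pattern exactly. A practical system would typically add slope constraints and path-length normalization, which are omitted here.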


Figure 5.11 Block diagram of a recognition system based on the pattern-recognition approach

The strengths and weaknesses of the pattern-recognition approach include:
1. System performance is sensitive to the amount of training data available for creating the sound-class reference patterns; generally, the more training, the higher the performance on virtually any task.
2. The reference patterns are sensitive to the speaking environment and to the transmission characteristics of the medium used to create the speech, because the spectral characteristics of speech are easily affected by transmission and background noise.
3. Because no speech-specific knowledge is used explicitly in the system, the approach is relatively insensitive to the choice of vocabulary words, task, syntax, and task semantics.
4. The computational load for both pattern training and pattern classification is generally proportional to the number of patterns being trained or recognized; computation for a large number of sound classes can therefore become, and often does become, prohibitive.
5. Because the system is insensitive to sound class, the basic techniques apply to a wide range of speech sounds, including phrases, whole words, and sub-word units. A basic set of techniques developed for one sound class (say, words) can therefore be applied directly to other sound classes (say, sub-word units) with little or no change to the algorithms.
6. It is relatively straightforward to incorporate syntactic (and even semantic) constraints directly into the pattern-recognition structure, which improves recognition accuracy and reduces computation.
5.6.3 Artificial-intelligence approach
The basic idea of the artificial-intelligence (AI) approach to speech recognition is to compile and combine knowledge from several knowledge sources and bring it to bear on the problem. For example, an AI approach to segmentation and labeling would augment the generally used acoustic knowledge with phonemic, lexical, syntactic, semantic, and even pragmatic knowledge. To be precise, we define the knowledge sources as follows:
- Acoustic knowledge: evidence of which sounds (predefined phonetic units) are spoken, based on spectral measurements and the presence or absence of features.
- Lexical knowledge: knowledge of how acoustic evidence combines to form words, as specified by a lexicon that maps sounds into words (or, equivalently, decomposes words into sounds).
- Syntactic knowledge: knowledge of how words combine to form grammatically correct strings (according to a language model), such as sentences or phrases.
- Semantic knowledge: understanding of the task domain, so that sentences or phrases can be validated as consistent with the task being performed or with previously decoded sentences.
- Pragmatic knowledge: the inference ability needed to resolve ambiguity of meaning, based on knowledge of which words or phrases are more commonly used.
To understand these knowledge sources and their limitations, consider the following English sentences:
1) Go to the refrigerator and get me a book.
2) The bears killed the rams.
3) Power plants colorless happily old.
4) Good ideas often run when least expected.
The first sentence is syntactically correct but semantically inconsistent: one does not expect to find a book in a refrigerator. The meaning of the second sentence depends on context. In a forest, it describes bears killing rams; in a conversation about football, it can mean that a team called the Bears beat a team called the Rams. The third sentence is neither syntactically correct nor meaningful. The fourth sentence is semantically inconsistent, but pragmatic knowledge suggests simply changing "run" to "come", which makes it meaningful even though the two words differ phonetically.
Combining the constraints of these knowledge sources lets a speech recognition system achieve higher performance. There are several ways to integrate the knowledge sources into a recognizer. The most common is the "bottom-up" processor shown in Figure 5.12. In the bottom-up approach, the lowest-level processes (such as feature extraction and phonetic decoding) precede the higher-level processes (lexical decoding, language modeling, and so on) sequentially, in such a way that each processing stage is constrained as little as possible. An alternative is the "top-down" processor, in which the language model generates word hypotheses that are matched against the speech signal, and syntactically and semantically meaningful sentences are built up on the basis of the word match scores. The top-down approach is shown in Figure 5.13. A third alternative is the "blackboard" approach, shown in Figure 5.14. Here all knowledge sources are considered independent, and a hypothesize-and-test paradigm serves as the medium of communication among them. Each knowledge source is data driven, triggered by the occurrence on the blackboard of patterns that match the templates specified by that knowledge source. The system operates asynchronously; assigned cost and utility considerations, together with an overall rating policy, combine and propagate ratings across all levels.


Figure 5.12 Bottom-up integration approach for a speech recognition system


Figure 5.13 Top-down integration approach for a speech recognition system


Figure 5.14 Blackboard integration approach for a speech recognition system

5.6.4 Neural networks in speech recognition systems
As we have seen, a speech recognition system based on the artificial-intelligence approach must establish many different knowledge sources. Two key concepts of the AI approach are therefore the automatic acquisition of knowledge (learning) and adaptation. One way to meet these requirements is to use neural networks. This section discusses why neural networks have been studied and how they are applied to speech recognition.
Figure 5.15 shows a conceptual model of a system that understands human speech. In this system, the acoustic analysis is based loosely on our understanding of how sound is processed in the ear. The various feature analyses represent processing at several levels along the neural pathways to the brain. The short-term and long-term memories provide external control of the neural processes in ways that are not yet well understood. The overall structure of the model is a feedforward connectionist network, that is, a neural network.


Figure 5.15 Conceptual block diagram of a human speech understanding system

Conventional artificial neural networks are structured to deal with static patterns. Speech, however, is inherently dynamic, so the conventional network structures need some modification before they can be applied to it. Although no correct or definitive way of handling the dynamics of speech is known so far, researchers have proposed several workable structures, among them the time delay neural network (TDNN) shown in Figure 5.16. This structure extends the input of each computational element to include N speech frames (that is, spectral vectors covering a duration of $N\Delta$ seconds, where $\Delta$ is the time separation between adjacent speech spectra). By extending the input to N frames (where N is on the order of 15), various types of acoustic-phonetic detectors become practical with the TDNN.
Another neural network structure for speech recognition is shown in Figure 5.17. It combines the concept of a matched filter with a conventional neural network to account for the dynamics of speech. The acoustic features of the speech are estimated by a conventional neural network architecture; the pattern classifier takes the detected acoustic feature vectors (suitably delayed), convolves them with filters matched to the acoustic features, and accumulates the results over time. At the appropriate time (corresponding to the end of a speech unit to be detected or recognized), the output units indicate the speech that is present.
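
To make the idea of extending each computational element's input to N frames concrete, here is a minimal sketch of a single time-delay unit slid along a frame sequence. The weights, the sigmoid nonlinearity, the tiny frame dimension, and N = 3 (instead of the roughly 15 frames mentioned above) are assumptions chosen only to keep the example short.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def splice(frames, t, n):
    """Concatenate the n frames ending at index t into one input vector."""
    window = frames[t - n + 1 : t + 1]
    return [value for frame in window for value in frame]

def tdnn_unit_outputs(frames, weights, bias, n):
    """Slide one time-delay unit over the frame sequence."""
    outputs = []
    for t in range(n - 1, len(frames)):
        x = splice(frames, t, n)
        activation = sum(w * v for w, v in zip(weights, x)) + bias
        outputs.append(sigmoid(activation))
    return outputs

# Five hypothetical two-dimensional spectral frames.
frames = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
n = 3  # number of frames seen by the unit
weights = [0.5, -0.5, 1.0, -1.0, 0.25, 0.25]  # n * frame dimension weights
outputs = tdnn_unit_outputs(frames, weights, bias=0.0, n=n)
```

Because the same weights are applied at every time position, the unit behaves like a detector that responds to a short spectro-temporal pattern wherever it occurs in the utterance.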

Figure 5.16 Block diagram of a TDNN


Figure 5.17 Block diagram of a system combining a neural network and matched filters for speech recognition

Neural networks have been studied and applied widely in many fields for several reasons:
- They can readily implement a massive degree of parallel computation, because a neural network is a highly parallel structure of simple, similar computational elements.
- They are inherently fault tolerant. Because the information embedded in the network is spread to every computational element, the structure is among the least sensitive to noise or internal defects.
- The connection weights need not be fixed; they can be adapted in real time to improve performance. This is the basic idea of adaptive learning, which is inherent in the neural network structure.
- Because of the nonlinearity within each computational element, a sufficiently large network can approximate (arbitrarily closely) any nonlinearity or nonlinear dynamical system. In other words, neural networks provide a convenient way of implementing nonlinear transformations between arbitrary inputs and outputs, and they are often more efficient than alternative physical implementations of the nonlinearity.
5.6.5 Recognition systems based on hidden Markov models (HMM)
Most current continuous speech recognition (CSR) systems are based on hidden Markov models (HMMs). Although the basic framework of HMM-based CSR has existed for decades, only recently has the technology improved enough to reduce the dependence on the inherent assumptions and to adapt the models to specific applications and environments.


Figure 5.18 Architecture of an HMM-based speech recognition system

The main components of a large-vocabulary CSR system are shown in Figure 5.18. The input audio waveform from a microphone is converted into a sequence of fixed-size acoustic vectors $\mathbf{y} = y_1, \ldots, y_T$ in a process called feature extraction. The decoder then attempts to find the word sequence $\mathbf{w} = w_1, \ldots, w_K$ that is most likely to have generated $\mathbf{y}$. In other words, the decoder tries to solve:
$$\hat{\mathbf{w}} = \arg\max_{\mathbf{w}} \{ p(\mathbf{w} \mid \mathbf{y}) \} \qquad (5.31)$$
However, $p(\mathbf{w} \mid \mathbf{y})$ is very hard to determine directly in practice, so Bayes' rule is applied to give:
$$\hat{\mathbf{w}} = \arg\max_{\mathbf{w}} \{ p(\mathbf{y} \mid \mathbf{w})\, p(\mathbf{w}) \} \qquad (5.32)$$
The likelihood $p(\mathbf{y} \mid \mathbf{w})$ is determined by an acoustic model and the prior probability $p(\mathbf{w})$ is determined by a language model. In practice, the acoustic model is not normalized, the language model is usually scaled by an empirically determined constant, and a word insertion penalty is added. In other words, the total log likelihood is computed as $\log p(\mathbf{y} \mid \mathbf{w}) + \alpha \log p(\mathbf{w}) + \beta |\mathbf{w}|$, where $\alpha$ is typically in the range 8 to 20 and $\beta$ is typically in the range 0 to -20. The basic unit of sound represented by the acoustic model is the phone. For example, the English word "bat" consists of three phones, /b/, /ae/ and /t/. English requires about 40 such phones.
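
The combined score above can be illustrated with a short sketch that ranks a list of competing hypotheses. The hypothesis list, the acoustic and language-model log scores, and the values alpha = 12 and beta = -5 are made-up assumptions within the ranges quoted in the text; a real decoder searches a far larger hypothesis space.

```python
import math

def total_log_score(log_acoustic, log_lm, num_words, alpha=12.0, beta=-5.0):
    """log p(y|w) + alpha * log p(w) + beta * |w|."""
    return log_acoustic + alpha * log_lm + beta * num_words

def decode(hypotheses, alpha=12.0, beta=-5.0):
    """Return the word sequence with the highest combined score."""
    def score(h):
        return total_log_score(h["log_acoustic"], h["log_lm"], len(h["words"]), alpha, beta)
    return max(hypotheses, key=score)["words"]

# Two hypothetical competing hypotheses for the same utterance.
hypotheses = [
    {"words": ["recognize", "speech"],
     "log_acoustic": -120.0, "log_lm": math.log(1e-3)},
    {"words": ["wreck", "a", "nice", "beach"],
     "log_acoustic": -118.0, "log_lm": math.log(1e-6)},
]
best = decode(hypotheses)
```

With the language-model scale and insertion penalty switched off (alpha = 0, beta = 0), the second hypothesis would win on acoustic score alone, which shows why the scaling matters.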
For any given $\mathbf{w}$, the corresponding acoustic model is synthesized by concatenating phone models to form words, as defined by a pronunciation dictionary. The parameters of these phone models are estimated from training data consisting of speech waveforms and their orthographic transcriptions. The language model is typically an N-gram model, in which the probability of each word is conditioned only on its N-1 predecessors. The N-gram parameters are estimated by counting N-tuples in appropriate text corpora (corpus: a collection of recorded utterances used as a basis for the descriptive analysis of a language). The decoder searches through all possible word sequences, using pruning to remove unlikely hypotheses and thereby keep the search tractable. When the end of the utterance is reached, the most likely word sequence is output. Alternatively, modern decoders can generate lattices that contain a compact representation of the most likely hypotheses.
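
As a small illustration of estimating N-gram parameters by counting, the sketch below trains a bigram model (N = 2) on a three-sentence toy corpus. The corpus, the sentence boundary markers `<s>` and `</s>`, and the absence of any smoothing are simplifying assumptions; practical language models smooth the counts so that unseen word pairs do not get zero probability.

```python
from collections import Counter

def train_bigram(sentences):
    """Estimate P(word | previous word) by counting bigrams (N = 2)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(words, words[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return {pair: count / context_counts[pair[0]]
            for pair, count in bigram_counts.items()}

def sentence_probability(model, sentence):
    """Product of bigram probabilities; unseen bigrams give probability 0."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(words, words[1:]):
        prob *= model.get((prev, word), 0.0)
    return prob

corpus = ["the bears killed the rams", "the rams ran", "the bears ran"]
model = train_bigram(corpus)
p_seen = sentence_probability(model, "the bears ran")
p_unseen = sentence_probability(model, "bears the ran")
```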
a) Feature extraction
As noted above, feature extraction seeks an optimal (usually compact, coded) representation of the speech signal. It must also minimize the loss of information and match as well as possible the distributional assumptions made by the acoustic models. Feature vectors are typically computed about every 10 ms using overlapping analysis windows. The most common feature extraction method in HMM-based recognition is MFCC, described in an earlier section.
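
The framing stage that precedes any spectral analysis can be sketched as follows. The 10 ms frame shift comes from the text; the 25 ms window length, the Hamming window, the 8 kHz sampling rate, and the synthetic tone used as input are assumptions made for illustration.

```python
import math

def frame_signal(samples, sample_rate, frame_ms=25.0, shift_ms=10.0):
    """Split a signal into overlapping Hamming-windowed frames."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    shift = int(round(sample_rate * shift_ms / 1000.0))
    window = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, shift):
        chunk = samples[start : start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

sample_rate = 8000
# One second of a 440 Hz tone as a stand-in for speech.
signal = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
          for n in range(sample_rate)]
frames = frame_signal(signal, sample_rate)
```

Each windowed frame would then be passed to the spectral analysis (for example MFCC computation) to produce one feature vector per 10 ms step.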
b) HMM acoustic models
As noted above, the spoken words in $\mathbf{w}$ are decomposed into a sequence of basic sounds called base phones. To allow for possible pronunciation variation, the likelihood $p(\mathbf{y} \mid \mathbf{w})$ can be computed over multiple pronunciations:
$$p(\mathbf{y} \mid \mathbf{w}) = \sum_{Q} p(\mathbf{y} \mid Q)\, p(Q \mid \mathbf{w}) \qquad (5.33)$$
Recognizers often approximate this sum by a maximum, so that alternative pronunciations can be decoded as though they were alternative word hypotheses. Each $Q$ is a sequence of word pronunciations $Q_1, \ldots, Q_K$, where each pronunciation is a sequence of base phones $Q_k = q_1^{(k)}, q_2^{(k)}, \ldots$. Then:
$$p(Q \mid \mathbf{w}) = \prod_{k=1}^{K} p(Q_k \mid w_k) \qquad (5.34)$$
where $p(Q_k \mid w_k)$ is the probability that word $w_k$ is pronounced as the base phone sequence $Q_k$. In practice, there are only a few possible pronunciations $Q_k$ for each word $w_k$, which keeps the sum in (5.33) tractable.


Figure 5.19 HMM-based phone model

Each base phone $q$ is represented by a continuous-density hidden Markov model (HMM) of the form shown in Figure 5.19, with transition parameters $\{a_{ij}\}$ and output observation distributions $\{b_j(\cdot)\}$. The output distributions are usually mixtures of Gaussians:
$$b_j(y) = \sum_{m=1}^{M} c_{jm}\, \mathcal{N}(y; \mu_{jm}, \Sigma_{jm}) \qquad (5.35)$$
where $\mathcal{N}$ denotes a normal distribution with mean $\mu_{jm}$ and covariance $\Sigma_{jm}$, and $c_{jm}$ are the mixture weights. The number of components $M$ in (5.35) is typically between 10 and 20. Because the dimensionality of the acoustic vectors $y$ is relatively high, the covariances are usually constrained to be diagonal matrices. The entry and exit states are nonemitting; they are included to simplify the concatenation of phone models to form words.
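
The output distribution (5.35) with diagonal covariances can be evaluated as in the sketch below. It works in the log domain with the log-sum-exp trick for numerical stability, which is a common implementation choice rather than something stated in the text; the two-component mixture and its parameter values are made up.

```python
import math

def log_gaussian_diag(y, mean, var):
    """Log density of a Gaussian with diagonal covariance (var = diagonal)."""
    total = 0.0
    for yi, mi, vi in zip(y, mean, var):
        total += -0.5 * (math.log(2.0 * math.pi * vi) + (yi - mi) ** 2 / vi)
    return total

def log_output_prob(y, weights, means, variances):
    """log b_j(y) for a mixture of diagonal-covariance Gaussians."""
    terms = [math.log(c) + log_gaussian_diag(y, mu, var)
             for c, mu, var in zip(weights, means, variances)]
    peak = max(terms)  # log-sum-exp: factor out the largest term
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# Hypothetical two-component mixture over two-dimensional vectors.
weights = [0.6, 0.4]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [2.0, 0.5]]
log_b = log_output_prob([0.5, -0.2], weights, means, variances)
```

With diagonal covariances, the cost per component grows linearly with the vector dimension instead of quadratically, which is one practical reason for the constraint.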
Given the composite HMM $Q$ formed by concatenating all of its constituent base phones, the acoustic likelihood is:
$$p(\mathbf{y} \mid Q) = \sum_{X} p(X, \mathbf{y} \mid Q) \qquad (5.36)$$
where $X = x(0), \ldots, x(T+1)$ is a state sequence through the composite model and
$$p(X, \mathbf{y} \mid Q) = a_{x(0),x(1)} \prod_{t=1}^{T} b_{x(t)}(y_t)\, a_{x(t),x(t+1)} \qquad (5.37)$$
The acoustic model parameters $\{a_{ij}\}$ and $\{b_j(\cdot)\}$ can be estimated efficiently from a corpus of training utterances using the expectation-maximization (EM) algorithm.
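
Summing (5.37) over every state sequence grows exponentially with T, so in practice the sum in (5.36) is computed with the forward recursion. The sketch below is one possible arrangement, assuming a small left-to-right model with nonemitting entry and exit states and precomputed output likelihoods; the transition matrix and likelihood values are invented. A brute-force sum over all paths is included only to check the recursion.

```python
from itertools import product

def forward_likelihood(a, b):
    """p(y | Q) by the forward recursion.

    a[i][j]: transition probabilities; state 0 is the entry state and the
             last state is the exit state (both nonemitting).
    b[j][t]: output likelihood of frame t in emitting state j.
    """
    num_states = len(a)
    emitting = range(1, num_states - 1)
    num_frames = len(b[1])
    alpha = {j: a[0][j] * b[j][0] for j in emitting}
    for t in range(1, num_frames):
        alpha = {j: sum(alpha[i] * a[i][j] for i in emitting) * b[j][t]
                 for j in emitting}
    return sum(alpha[j] * a[j][num_states - 1] for j in emitting)

def brute_force_likelihood(a, b):
    """Direct sum over every state sequence, as in (5.36) and (5.37)."""
    num_states = len(a)
    emitting = range(1, num_states - 1)
    num_frames = len(b[1])
    total = 0.0
    for path in product(emitting, repeat=num_frames):
        p = a[0][path[0]]
        for t, state in enumerate(path):
            nxt = path[t + 1] if t + 1 < num_frames else num_states - 1
            p *= b[state][t] * a[state][nxt]
        total += p
    return total

# Hypothetical phone model with three emitting states (1, 2, 3).
a = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.6, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
b = {
    1: [0.9, 0.5, 0.1, 0.1],
    2: [0.1, 0.4, 0.6, 0.2],
    3: [0.1, 0.1, 0.3, 0.8],
}
likelihood = forward_likelihood(a, b)
```

The recursion needs on the order of S squared times T operations for S emitting states, whereas the brute-force sum visits S to the power T paths.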

5.7. Speech recognition lab exercise
Using a personal computer and Matlab (or another programming language), carry out the following task:
- Build a simple (limited-vocabulary) speech recognition system based on:
o A neural network
o An HMM






























Appendix 1: Neural networks
Introduction
Research into the structure of the human brain and how it works began quite early. As science has progressed, important advances have been made in this field. However, the human brain is an extremely complex system, and our understanding of its architecture and operation is still incomplete. Even so, machines with some brain-like capabilities have been built by imitating two characteristics:
- Knowledge is acquired through a learning process.
- Capabilities arise from the network architecture and the properties of the connections.
These machines are known collectively as artificial neural networks, or simply neural networks.
The main characteristics of neural networks are:
- Nonlinearity: they allow nonlinear processing.
- Input-output mapping: this allows supervised learning.
- Adaptivity: the parameters change to suit the environment.
- Response according to the training patterns.
- Contextual information: knowledge is represented by the state and architecture of the network.
- Fault tolerance.
- Biological inspiration.
Fundamentals of neural networks
A simple neural network is shown in Figure A.1. Suppose there are N inputs, labeled $x_1, x_2, \ldots, x_N$, with corresponding weights $w_1, w_2, \ldots, w_N$. The nonlinear input-output relation is then:
$$y = f\left( \sum_{i=1}^{N} w_i x_i - \phi \right)$$
where $\phi$ is the internal threshold, also called the offset, and $f(\cdot)$ is a nonlinear function.

Figure A.1: Simple structure of a neural network with N inputs

Some common forms of $f$ are:
1. Hard limiter:
$$f(x) = \begin{cases} +1, & x \ge 0 \\ -1, & x < 0 \end{cases}$$
2. Log-sigmoid function:
$$f(x) = \frac{1}{1 + e^{-\beta x}} \qquad (\beta > 0)$$
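
The input-output relation and the two nonlinearities above translate directly into code. In this minimal sketch the input values, weights, and thresholds are arbitrary assumptions chosen so the results are easy to verify by hand.

```python
import math

def hard_limiter(x):
    """f(x) = +1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0

def log_sigmoid(x, beta=1.0):
    """f(x) = 1 / (1 + exp(-beta * x)), beta > 0."""
    return 1.0 / (1.0 + math.exp(-beta * x))

def neuron_output(inputs, weights, threshold, activation):
    """y = f(sum_i w_i * x_i - threshold)."""
    net = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return activation(net)

x = [1.0, 0.0, 1.0]
w = [0.5, -1.0, 0.25]
# Weighted sum is 0.75; below the threshold 1.0, so the hard limiter gives -1.
y_hard = neuron_output(x, w, threshold=1.0, activation=hard_limiter)
# Weighted sum equals the threshold 0.75, so the sigmoid gives 0.5.
y_soft = neuron_output(x, w, threshold=0.75, activation=log_sigmoid)
```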


Neural network topologies
An important factor in designing and applying a neural network is its topology (network topology). There are three basic types:
1) Single-layer or multilayer networks:


Figure A.2: Single-layer (a) and two-layer (b) neural network structures

2) Recurrent networks:


Figure A.3: Recurrent neural network structure

3) Self-organizing networks:

Figure A.4: A 3x3 self-organizing (SOM) neural network structure
Appendix 2: Hidden Markov models
Markov processes
A random process $X(t)$ is called a Markov process if, given its present state, the future of the process does not depend on its past. In other words, for any times $t_1 < t_2 < \ldots < t_k < t_{k+1}$:
$$\Pr[X(t_{k+1}) = x_{k+1} \mid X(t_k) = x_k, \ldots, X(t_1) = x_1] = \Pr[X(t_{k+1}) = x_{k+1} \mid X(t_k) = x_k]$$
The value of $X(t)$ at time $t$ is usually called the state of the process at time $t$.

Discrete-time Markov chains
Let $X_n$ be an integer-valued, discrete-time Markov chain that starts at $n = 0$ with probability mass function (pmf):
$$p_j(0) = \Pr[X_0 = j], \qquad j = 0, 1, \ldots$$
The joint pmf of the first $n+1$ values of the process is then:
$$\Pr[X_n = i_n, \ldots, X_0 = i_0] = \Pr[X_n = i_n \mid X_{n-1} = i_{n-1}] \cdots \Pr[X_1 = i_1 \mid X_0 = i_0]\, \Pr[X_0 = i_0]$$
This shows that the joint pmf of a particular sequence is the product of the probability of the initial state and the probabilities of the subsequent one-step state transitions.
Suppose the one-step state transition probabilities are fixed and do not change with time, that is:
$$\Pr[X_{n+1} = j \mid X_n = i] = a_{ij} \quad \text{for all } n$$
Then $X_n$ is said to have homogeneous transition probabilities, and the joint pmf of $X_0, \ldots, X_n$ becomes:
$$\Pr[X_n = i_n, \ldots, X_0 = i_0] = a_{i_{n-1} i_n} \cdots a_{i_0 i_1}\, p_{i_0}(0)$$
Thus $X_n$ is completely specified by the initial pmf $p_i(0)$ and the matrix $P$ of one-step transition probabilities:
$$P = \begin{bmatrix} a_{00} & a_{01} & a_{02} & \cdots \\ a_{10} & a_{11} & a_{12} & \cdots \\ \vdots & \vdots & \vdots & \\ a_{i0} & a_{i1} & a_{i2} & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix}$$



$P$ is called the transition probability matrix. Note that each row of $P$ must sum to 1.
Figure B.1 shows a discrete Markov chain with 5 states labeled $S_1$ to $S_5$, with the corresponding transition probabilities $a_{ij}$ labeling the branches.

Figure B.1: A discrete Markov chain with 5 states
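
The row-sum property of the transition matrix and the product formula for the joint probability of a state sequence can be checked numerically. The three-state chain below is a made-up example (it is not the five-state chain of Figure B.1), and states are indexed from 0.

```python
def rows_sum_to_one(P, tol=1e-9):
    """Check that every row of the transition matrix sums to 1."""
    return all(abs(sum(row) - 1.0) <= tol for row in P)

def sequence_probability(states, initial, P):
    """Pr[X_0 = i_0, ..., X_n = i_n] = p_{i_0}(0) * a_{i_0 i_1} * ... * a_{i_{n-1} i_n}."""
    prob = initial[states[0]]
    for prev, nxt in zip(states, states[1:]):
        prob *= P[prev][nxt]
    return prob

# Hypothetical three-state chain.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
initial = [1.0, 0.0, 0.0]  # the chain always starts in state 0
prob = sequence_probability([0, 1, 1, 2], initial, P)  # 1.0 * 0.5 * 0.5 * 0.25
```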

Hidden Markov models
The previous section dealt with Markov models in which each state corresponds to an observable (physical) event. Such models are too restrictive for many practical problems. The model is therefore extended to include the case where the observation is a probabilistic function of the state. The resulting model is a doubly embedded stochastic process: an underlying stochastic process that is not observable (it is hidden) and can only be observed through another set of stochastic processes that produce the sequence of observations. Such a model is called a hidden Markov model (HMM).
To illustrate, consider the following coin-tossing models. Someone tosses coins but does not tell us exactly what he is doing; he only reports the result of each toss. For us, then, a series of coin-tossing experiments is hidden, and the only observation is the resulting sequence of heads and tails. The problem is how to build an HMM that explains the observed sequence of heads and tails. The first question is what the states of the model correspond to; the next is how many states the model needs.
Figure B.2 shows three example cases. The first assumes that a single biased coin is tossed. The model is then a two-state model in which each state corresponds to one side of the coin. Clearly, the Markov model in this case is observable. Note that we could also use a one-state Markov model in which the state corresponds to the single biased coin and the unknown parameter is the bias of the coin.

Figure B.2: Three possible Markov models for the hidden coin-tossing experiment

The second case is a two-state model in which each state corresponds to a different biased coin being tossed. Each state is characterized by a probability distribution of heads and tails, and the transitions between states are characterized by a state transition matrix.
The third case uses three different biased coins, where the choice among the three coins is governed by some probabilistic event.
Given the choice among these three cases for explaining the observed sequence of heads and tails, the question is which model best matches the actual observations. The model in the first case has only one unknown parameter, that is, one degree of freedom, whereas the models in the second and third cases have 4 and 9 degrees of freedom respectively. With more degrees of freedom, the larger HMMs would seem better able to model a series of coin-tossing experiments than the smaller ones. Although this is true in theory, practical considerations impose strong limits on the size of the model.
An HMM is characterized by:
1) $N$, the number of states in the model. Although the states are hidden, in many practical applications some physical meaning is attached to the states or to sets of states of the model.
2) $M$, the number of distinct observation symbols per state, that is, the size of the discrete alphabet $V = \{v_1, \ldots, v_M\}$.
3) The state transition probability distribution $P = \{a_{ij}\}$, where $a_{ij} = \Pr[X_{n+1} = S_j \mid X_n = S_i]$, $1 \le i, j \le N$. In the special case where any state can reach any other state in a single step, $a_{ij} > 0$ for all $i, j$. For other types of HMM, $a_{ij} = 0$ for one or more $(i, j)$ pairs.
4) The observation symbol probability distribution in state $j$, $B = \{b_j(k)\}$, where $b_j(k) = \Pr[v_k \text{ at } t \mid X_t = S_j]$, $1 \le j \le N$, $1 \le k \le M$.
5) The initial state distribution $\pi = \{\pi_i\}$, where $\pi_i = \Pr[X_1 = S_i]$, $1 \le i \le N$.
Given values of $N$, $M$, $P$, $B$, and $\pi$, the HMM can be used as a generator of an observation sequence $O = O_1 O_2 \ldots O_T$ (where each observation $O_t$ is a symbol from the alphabet $V$ and $T$ is the number of observations in the sequence) as follows:
1) Choose an initial state $X_1 = S_i$ according to the initial state distribution $\pi$.
2) Set $t = 1$.
3) Choose $O_t = v_k$ according to the symbol probability distribution in state $S_i$, that is, $b_i(k)$.
4) Move to a new state $X_{t+1} = S_j$ according to the state transition probability distribution for state $S_i$, that is, $a_{ij}$.
5) Set $t = t + 1$; return to step 3 if $t \le T$; otherwise terminate.
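
The generation procedure above can be sketched as a short sampling routine. The two-state "two biased coins" parameters, the symbol names H and T, the function names, and the fixed random seed are all assumptions made for illustration; the transition matrix is called A in the code.

```python
import random

def sample(distribution, rng):
    """Draw an index according to a discrete probability distribution."""
    return rng.choices(range(len(distribution)), weights=distribution, k=1)[0]

def generate(pi, A, B, symbols, T, seed=0):
    """Generate T observations (and the hidden state path) from a discrete HMM."""
    rng = random.Random(seed)
    state = sample(pi, rng)                 # step 1: initial state
    states, observations = [], []
    for _ in range(T):                      # steps 2 and 5: t = 1, ..., T
        states.append(state)
        observations.append(symbols[sample(B[state], rng)])  # step 3: emit
        state = sample(A[state], rng)       # step 4: move to the next state
    return states, observations

# Hypothetical two-coin model: each state is one biased coin.
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.8, 0.2], [0.3, 0.7]]  # per state: probability of H, probability of T
states, observations = generate(pi, A, B, symbols=["H", "T"], T=10)
```

An observer sees only `observations`; the list `states` is exactly the part of the process that stays hidden in the coin-tossing example.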




















