Computer Based Automatic Speech Processing

TABLE OF CONTENTS
1. Introduction
2. Markov Models
3. Hidden Markov Models (HMM)
4. The Three Basic Problems of HMMs
5. Applying HMMs to Automatic Speech Recognition (ASR)
HMM and Its Application in Speech Recognition
1. Introduction
The theory of Markov chains was developed in the early 1900s. Hidden Markov models were developed in the late 1960s, became widely used in speech recognition during the 1960s and 1970s, and were introduced into computer science in 1989.
Many practical problems can be described as a cause-and-effect relationship in which only the effect can be observed while the cause stays hidden. The HMM is a technique for solving problems of this kind, where the hidden cause has to be inferred from the observed effect.
A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov process with unknown parameters.
The task is to determine the hidden parameters from the observable ones, based on this assumption. The model parameters obtained in this way can then be used for further analysis.
Common applications of hidden Markov models:
- Bioinformatics: a field that applies techniques from applied mathematics, informatics, statistics, computer science, artificial intelligence, chemistry, and biochemistry to biological problems.
- Signal processing, data analysis, and pattern recognition.
- Language analysis, where HMMs are used heavily:
  - Speech recognition (observed: acoustic signal; hidden: words)
  - Handwriting recognition (observed: symbols; hidden: words)
  - Part-of-speech tagging (observed: words; hidden: tags such as noun, verb, adjective)
  - Machine translation (observed: foreign-language words; hidden: the corresponding words in the target language)
2. Markov Models
A sequence of random states has the Markov property if the probability of moving to the next state depends only on the current state and not on earlier states.
A state-transition sequence that can be observed directly is called a Markov chain. When the state-transition sequence cannot be observed directly, the model is called a hidden Markov model.
- There are N states: $s_1, s_2, \dots, s_N$.
- Time advances in discrete steps: t = 0, t = 1, ...
- At time step t the system is in exactly one of these states, denoted $q_t$, with $q_t \in \{s_1, s_2, \dots, s_N\}$.
[Figure: a Markov chain with N = 3 states. At t = 0 the current state is $q_t = q_0 = s_3$.]
Between successive time steps, the next state is chosen at random. The current state determines the probability distribution of the next state (usually drawn as arcs connecting the states).
Given $q_t$, the state $q_{t+1}$ is conditionally independent of $\{q_{t-1}, q_{t-2}, \dots, q_1, q_0\}$.
$P(A)$ is the prior (marginal) probability of A. $P(A \mid B)$ is the posterior, or conditional, probability: the probability of A given B (here, the probability of a transition from B to A).
A sequence q is called a Markov chain if it satisfies the Markov property: the next state depends only on the current state and not on any earlier state. This is called a first-order Markov model.
In a second-order Markov model, the state $q_t$ depends on the two immediately preceding states, $q_{t-1}$ and $q_{t-2}$.
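The two orders can be written side by side. This is only a restatement in the notation already used here ($q_t$ for the state at step t); the first line is the first-order property and the second line the second-order one:

```latex
\begin{aligned}
P(q_t \mid q_{t-1}, q_{t-2}, \dots, q_0) &= P(q_t \mid q_{t-1})
  && \text{(first order)} \\
P(q_t \mid q_{t-1}, q_{t-2}, \dots, q_0) &= P(q_t \mid q_{t-1}, q_{t-2})
  && \text{(second order)}
\end{aligned}
```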
A simple Markov model for weather forecasting
The weather on any given day is in one of three states:
S1: rain
S2: cloudy
S3: sunny
The state-transition probabilities are

$$A = \{a_{ij}\} = \begin{pmatrix} 0.4 & 0.3 & 0.3 \\ 0.2 & 0.6 & 0.2 \\ 0.1 & 0.1 & 0.8 \end{pmatrix}$$

where $a_{ij}$ is the probability of moving from state $S_i$ to state $S_j$ (rows and columns are ordered $S_1, S_2, S_3$).

Example: what is the probability, according to the model, that the weather over 8 consecutive days is sunny on day 1 followed by "sun - sun - rain - rain - sun - cloudy - sun"?

Solution
We define the observation sequence O as
O = (sunny, sunny, sunny, rain, rain, sunny, cloudy, sunny)
  = (3, 3, 3, 1, 1, 3, 2, 3)
for days 1 through 8. This corresponds to the weather conditions over the 8-day period, and we want to compute $P(O \mid \text{Model})$, the probability of the observation sequence O given the weather model above. It can be evaluated directly:

$$\begin{aligned}
P(O \mid \text{Model}) &= P[3, 3, 3, 1, 1, 3, 2, 3 \mid \text{Model}] \\
&= P[3]\, P[3 \mid 3]^2\, P[1 \mid 3]\, P[1 \mid 1]\, P[3 \mid 1]\, P[2 \mid 3]\, P[3 \mid 2] \\
&= \pi_3\, (a_{33})^2\, a_{31}\, a_{11}\, a_{13}\, a_{32}\, a_{23} \\
&= (1.0)(0.8)^2(0.1)(0.4)(0.3)(0.1)(0.2) \\
&= 1.536 \times 10^{-4}
\end{aligned}$$

Here we use the notation $\pi_i = P[q_1 = i]$, $1 \le i \le N$, for the initial state probabilities; $\pi_3 = 1.0$ because day 1 is taken to be sunny.
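The calculation above can be checked numerically. The following is a minimal sketch, assuming 1-based state labels as in the example; the function name `sequence_probability` and the dictionary form of the initial distribution are illustrative choices, not part of the original text.

```python
# Transition matrix from the weather example: A[i][j] = P(next = j+1 | current = i+1),
# with states 1 = rain, 2 = cloudy, 3 = sunny.
A = [
    [0.4, 0.3, 0.3],
    [0.2, 0.6, 0.2],
    [0.1, 0.1, 0.8],
]


def sequence_probability(states, A, initial):
    """Probability of a fully observed state sequence under a first-order Markov chain.

    states  : list of 1-based state labels, e.g. [3, 3, 1]
    A       : row-stochastic transition matrix (list of lists)
    initial : dict mapping a state label to its initial probability pi_i
    """
    prob = initial.get(states[0], 0.0)
    # Multiply one transition probability per consecutive pair of days.
    for prev, cur in zip(states, states[1:]):
        prob *= A[prev - 1][cur - 1]
    return prob


O = [3, 3, 3, 1, 1, 3, 2, 3]
# Assumption matching the worked example: day 1 is known to be sunny, so pi_3 = 1.
p = sequence_probability(O, A, {3: 1.0})
print(f"P(O | Model) = {p:.4e}")
```

Up to floating-point rounding, the printed value should agree with the hand calculation, $1.536 \times 10^{-4}$.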
3. Hidden Markov Models (HMM)
The previous model assumes that each state corresponds uniquely to one observable event. Once an observation is made, the state of the system is known immediately, so the state itself carries little additional information.
That model is too restrictive for real-world problems.
To build a more flexible model, we assume that the observations are a probabilistic function of the state.
Each state can generate several outputs according to a probability distribution, and each distinct output can potentially be generated by any state.
The model is called a hidden Markov model (HMM) because the state sequence cannot be observed directly; it can only be inferred approximately from the observation sequence that the system emits.
Suppose you have a soft drink vending machine that can be in one of two states, cola preferring (CP) and iced tea preferring (IP), and that changes state at random after each purchase, as follows:
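[Figure: state diagram of the vending machine. The states are marked NOT OBSERVABLE, and each state has an output probability matrix.]
There are three observable outputs: cola, iced tea, and lemonade.
The hidden Markov model for the vending machine therefore consists of the two hidden states CP and IP together with these three observable outputs.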
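To make the generative picture concrete, the machine can be simulated. This is only a sketch: the transition and output probabilities below are hypothetical values chosen for illustration (the original figure with the actual numbers did not survive), and the sketch assumes that the output of each purchase is drawn from the current state before the state changes.

```python
import random

# Hypothetical parameters, NOT taken from the source document.
TRANS = {"CP": {"CP": 0.7, "IP": 0.3},      # P(next state | current state)
         "IP": {"CP": 0.5, "IP": 0.5}}
EMIT = {"CP": {"cola": 0.6, "iced tea": 0.1, "lemonade": 0.3},   # P(output | state)
        "IP": {"cola": 0.1, "iced tea": 0.7, "lemonade": 0.2}}


def draw(dist, rng):
    """Draw one key from a {outcome: probability} dictionary."""
    outcomes = list(dist)
    weights = [dist[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def simulate(n_purchases, start="CP", seed=0):
    """Return (hidden_states, outputs) for n_purchases consecutive purchases."""
    rng = random.Random(seed)
    state, hidden, outputs = start, [], []
    for _ in range(n_purchases):
        hidden.append(state)
        outputs.append(draw(EMIT[state], rng))   # what the customer sees
        state = draw(TRANS[state], rng)          # what the customer does not see
    return hidden, outputs


hidden, outputs = simulate(5)
print("outputs (observable):", outputs)
print("states  (hidden):    ", hidden)
```

An observer sees only the `outputs` list; the `hidden` list is exactly the information that the three HMM problems in the next section try to reason about.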
Example 1: weather forecasting.
Observable states: soggy, damp, dryish, dry.
Hidden states: sunny, cloudy, rainy.
Example 2: speech recognition.
Components of an HMM
$q_t$: the state at time t.
$o_t$: the observation (symbol) at time t.
$\pi = \{\pi_i\}$: the initial state distribution.
$A = \{a_{ij}\}$: the state-transition probability distribution.
$B = \{b_i(k)\}$: the observation symbol probability distribution for each state.
An HMM is specified by five components, $\lambda = (S, V, A, B, \pi)$:
1- The set of hidden states, $S = \{1, 2, \dots, N\}$, where N is the number of states and $q_t$ is the state at time t.
2- The set of observable symbols, $V = \{v_1, v_2, \dots, v_M\}$, where M is the number of distinct symbols (written V to keep it distinct from an observation sequence O).
3- The initial state distribution, $\pi = \{\pi_i\}$ with $\pi_i = P(q_1 = i)$, $1 \le i \le N$.
4- The state-transition probability distribution, $A = \{a_{ij}\}$.
5- The observation symbol probability distribution for each state, $B = \{b_i(k)\}$.
In summary, an HMM consists of:
- two fixed size parameters, N and M (the number of states and the number of observation symbols, that is, the sizes of S and V);
- three sets of probability distributions: A, B, and $\pi$.
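One way to hold these components in code is a small container with a consistency check. This is a sketch only: the class name `HMM`, the function `validate`, and the example numbers are hypothetical; N and M are implied by the shapes of the lists rather than stored separately.

```python
from typing import List, NamedTuple


class HMM(NamedTuple):
    """Discrete HMM lambda = (A, B, pi) with N states and M symbols (illustrative)."""
    A: List[List[float]]   # N x N, A[i][j] = P(next state j | current state i)
    B: List[List[float]]   # N x M, B[i][k] = P(symbol k | state i)
    pi: List[float]        # length N, pi[i] = P(first state is i)


def validate(model: HMM, tol: float = 1e-9) -> None:
    """Raise ValueError unless the shapes agree and every distribution sums to 1."""
    n = len(model.pi)
    if len(model.A) != n or any(len(row) != n for row in model.A):
        raise ValueError("A must be N x N")
    if len(model.B) != n or len({len(row) for row in model.B}) != 1:
        raise ValueError("B must be N x M")
    named = [("pi", model.pi)]
    named += [(f"A[{i}]", row) for i, row in enumerate(model.A)]
    named += [(f"B[{i}]", row) for i, row in enumerate(model.B)]
    for name, dist in named:
        if any(p < 0 for p in dist) or abs(sum(dist) - 1.0) > tol:
            raise ValueError(f"{name} is not a probability distribution")


# Hypothetical example with N = 2 states and M = 3 symbols.
model = HMM(
    A=[[0.7, 0.3], [0.5, 0.5]],
    B=[[0.6, 0.1, 0.3], [0.1, 0.7, 0.2]],
    pi=[0.5, 0.5],
)
validate(model)
print("N =", len(model.pi), "M =", len(model.B[0]))
```

Checking that every row of A and B and the vector pi sums to 1 catches the most common mistake when entering a model by hand.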
4. The Three Basic Problems of HMMs
Problem 1 (evaluation problem): Given an observation sequence $O = (o_1 o_2 \dots o_T)$ and an HMM $\lambda$, determine the probability $P(O \mid \lambda)$ that the model generates the sequence.
Problem 2 (decoding problem): Given an observation sequence $O = (o_1 o_2 \dots o_T)$ and an HMM $\lambda$, determine the state sequence $Q = (q_1 q_2 \dots q_T)$ that most probably generated O (the optimal path). In other words, find the state sequence of the model that best explains the observations O.
Problem 3 (learning problem): Adjust the parameters of the HMM $\lambda$ to maximize $P(O \mid \lambda)$, that is, find the model that best fits the observation sequence.
Problem 1: Evaluation problem
Recall the task: given an observation sequence $O = (o_1 o_2 \dots o_T)$ and an HMM $\lambda$, compute $P(O \mid \lambda)$.
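To solve this problem we study the forward algorithm.
Straightforward approach: the most obvious way to compute $P(O \mid \lambda)$ for the observation sequence $O = (o_1 o_2 \dots o_T)$ is to sum the probability over all possible state sequences $Q = (q_1 q_2 \dots q_T)$:

$$P(O \mid \lambda) = \sum_{\text{all } Q} P(Q \mid \lambda)\, P(O \mid Q, \lambda)$$

Applying the Markov assumption:

$$P(Q \mid \lambda) = \pi_{q_1} \prod_{t=2}^{T} a_{q_{t-1} q_t}$$

where $A = \{a_{ij}\}$, $a_{ij} = P(q_t = j \mid q_{t-1} = i)$, $1 \le i, j \le N$.

Applying the output-independence assumption:

$$P(O \mid Q, \lambda) = \prod_{t=1}^{T} b_{q_t}(o_t)$$

where $B = \{b_j(k)\}$, $b_j(k) = P(o_t = v_k \mid q_t = j)$, $1 \le j \le N$, $1 \le k \le M$, with $v_k$ the k-th observation symbol, and $b_j(o_t)$ is shorthand for $b_j(k)$ when $o_t = v_k$.

Enumerating all $N^T$ state sequences is impractical. The forward algorithm computes the same quantity with:
Time complexity: $O(N^2 T)$
Space complexity: $O(NT)$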
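The forward recursion itself can be sketched as follows. It uses the standard forward variable $\alpha_t(i) = P(o_1 \dots o_t, q_t = i \mid \lambda)$: initialize $\alpha_1(i) = \pi_i b_i(o_1)$, update $\alpha_{t+1}(j) = \big[\sum_i \alpha_t(i) a_{ij}\big] b_j(o_{t+1})$, and finish with $P(O \mid \lambda) = \sum_i \alpha_T(i)$. The model numbers are hypothetical, observations are given as 0-based symbol indices, and `brute_force` is included only to check the result against the direct sum over all state sequences.

```python
from itertools import product

# Hypothetical 2-state, 3-symbol model (illustrative numbers, not from the source).
A = [[0.7, 0.3], [0.5, 0.5]]              # A[i][j] = P(next state j | state i)
B = [[0.6, 0.1, 0.3], [0.1, 0.7, 0.2]]    # B[i][k] = P(symbol k | state i)
pi = [0.5, 0.5]                           # pi[i] = P(first state is i)


def forward(obs, A, B, pi):
    """Return (alpha, P(O | lambda)); alpha[t][i] covers the first t+1 observations."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]            # initialization
    for o in obs[1:]:                                              # induction
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha, sum(alpha[-1])                                   # termination


def brute_force(obs, A, B, pi):
    """Sum P(O, Q | lambda) over all N**T state sequences (for checking only)."""
    n, total = len(pi), 0.0
    for path in product(range(n), repeat=len(obs)):
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total


obs = [0, 2, 1]   # three observations, as symbol indices
alpha, likelihood = forward(obs, A, B, pi)
print("forward:    ", likelihood)
print("brute force:", brute_force(obs, A, B, pi))
```

Each step of the induction touches every pair of states once, which is where the $O(N^2 T)$ time bound comes from; storing the whole `alpha` table accounts for the $O(NT)$ space.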
The forward-backward algorithm
The backward variables are computed by propagating in the reverse direction, from the end of the observation sequence back to its start.
[Table: backward variables]
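A backward pass can be sketched in the same style as the forward pass. The sketch assumes the standard definition $\beta_t(i) = P(o_{t+1} \dots o_T \mid q_t = i, \lambda)$ with $\beta_T(i) = 1$, which may differ in detail from the table in the original; the model numbers are hypothetical. It also checks the standard identity that $\sum_i \alpha_t(i)\,\beta_t(i)$ equals $P(O \mid \lambda)$ at every time step.

```python
# Hypothetical 2-state, 3-symbol model (illustrative numbers, not from the source).
A = [[0.7, 0.3], [0.5, 0.5]]              # A[i][j] = P(next state j | state i)
B = [[0.6, 0.1, 0.3], [0.1, 0.7, 0.2]]    # B[i][k] = P(symbol k | state i)
pi = [0.5, 0.5]                           # pi[i] = P(first state is i)


def forward(obs, A, B, pi):
    """alpha[t][i] = P(first t+1 observations, state i at that step)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(obs, A, B):
    """beta[t][i] = P(observations after step t | state i at step t)."""
    n, T = len(A), len(obs)
    beta = [[1.0] * n for _ in range(T)]            # last row stays at 1
    for t in range(T - 2, -1, -1):                  # sweep from the end to the start
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   for i in range(n)]
    return beta


obs = [0, 2, 1]
alpha, beta = forward(obs, A, B, pi), backward(obs, A, B)
# The same likelihood is recovered at every time step.
per_step = [sum(a * b for a, b in zip(alpha[t], beta[t])) for t in range(len(obs))]
print(per_step)
```

The forward and backward tables together are what the learning problem later uses to compute expected state occupancies.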
Problem 2: The Viterbi algorithm (decoding problem)
Recall the task: given an observation sequence $O = (o_1 o_2 \dots o_T)$ and an HMM $\lambda$, determine the state sequence $Q = (q_1 q_2 \dots q_T)$ that most probably generated O (the optimal path), that is, the state sequence of the model that best explains the observations O.
The goal is to find the state sequence that attains $\max_Q P(Q \mid O, \lambda)$ for the given observation sequence $O = (o_1 o_2 \dots o_T)$ and HMM $\lambda$. Because $P(O \mid \lambda)$ does not depend on Q, this is the same as maximizing the joint probability $P(O, Q \mid \lambda)$.
The Viterbi algorithm works from the following:
- The observation sequence $O = (o_1 o_2 \dots o_T)$ and the HMM $\lambda$.
- For a state sequence $Q = (q_1 q_2 \dots q_T)$, the probability of Q together with the observations $O = (o_1 o_2 \dots o_T)$ under the HMM $\lambda$ is

$$P(O, Q \mid \lambda) = \pi_{q_1}\, b_{q_1}(o_1) \prod_{t=2}^{T} a_{q_{t-1} q_t}\, b_{q_t}(o_t)$$

Procedure:
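The standard Viterbi steps (initialization, recursion, termination, backtracking) can be sketched as follows. This is a minimal illustration and not necessarily the exact formulation of the original slides; the model numbers are hypothetical, and states and symbols are 0-based indices. `delta[t][j]` holds the probability of the best path ending in state j after t+1 observations, and `psi[t][j]` remembers which previous state achieved it.

```python
# Hypothetical 2-state, 3-symbol model (illustrative numbers, not from the source).
A = [[0.7, 0.3], [0.5, 0.5]]              # A[i][j] = P(next state j | state i)
B = [[0.6, 0.1, 0.3], [0.1, 0.7, 0.2]]    # B[i][k] = P(symbol k | state i)
pi = [0.5, 0.5]                           # pi[i] = P(first state is i)


def viterbi(obs, A, B, pi):
    """Return (best_path, probability of that path together with obs)."""
    n, T = len(pi), len(obs)
    # Initialization: best "path" of length 1 ending in each state.
    delta = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    psi = [[0] * n]
    # Recursion: extend the best path into each state j by one step.
    for t in range(1, T):
        d_row, p_row = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[t - 1][i] * A[i][j])
            d_row.append(delta[t - 1][best_i] * A[best_i][j] * B[j][obs[t]])
            p_row.append(best_i)
        delta.append(d_row)
        psi.append(p_row)
    # Termination: pick the best final state.
    last = max(range(n), key=lambda i: delta[-1][i])
    # Backtracking: follow the stored predecessors from the end to the start.
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    path.reverse()
    return path, delta[-1][last]


obs = [0, 2, 1]
path, prob = viterbi(obs, A, B, pi)
print("best state path:", path, "with probability", prob)
```

The only difference from the forward recursion is that the sum over previous states is replaced by a maximum, plus the bookkeeping needed to recover the path.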
Problem 3: The Baum-Welch algorithm (learning problem)
Adjust the parameters of the HMM $\lambda$ to maximize $P(O \mid \lambda)$, that is, find the model that best fits the observation sequence.
Because the state sequence Q is hidden, re-estimation works with expected values taken over Q under the current model.
To describe the re-estimation of the HMM parameters, we first define $\xi_t(i, j)$, the probability of being in state i at time t and in state j at time t + 1, given the model and the observation sequence:

$$\xi_t(i, j) = P(q_t = i,\ q_{t+1} = j \mid O, \lambda)$$

We also define $\gamma_t(i)$ as the probability of being in state i at time t, given the observation sequence and the model. $\gamma_t(i)$ is related to $\xi_t(i, j)$ by summing over j:

$$\gamma_t(i) = \sum_{j=1}^{N} \xi_t(i, j)$$
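The two quantities can be computed from forward and backward tables. The sketch below assumes the standard expression $\xi_t(i,j) = \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j) / P(O \mid \lambda)$ and the standard transition update (expected transitions from i to j divided by expected visits to i); the original slides may present these differently. Model numbers and function names are hypothetical, and only the transition matrix is re-estimated, for one iteration.

```python
# Hypothetical 2-state, 3-symbol model (illustrative numbers, not from the source).
A = [[0.7, 0.3], [0.5, 0.5]]
B = [[0.6, 0.1, 0.3], [0.1, 0.7, 0.2]]
pi = [0.5, 0.5]


def forward(obs, A, B, pi):
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha


def backward(obs, A, B):
    n, T = len(A), len(obs)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
                   for i in range(n)]
    return beta


def expected_counts(obs, A, B, pi):
    """Return (xi, gamma) for time steps 1 .. T-1 (0-based: 0 .. T-2)."""
    n, T = len(pi), len(obs)
    alpha, beta = forward(obs, A, B, pi), backward(obs, A, B)
    likelihood = sum(alpha[-1])
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    # gamma_t(i) obtained exactly as in the text: sum xi_t(i, j) over j.
    gamma = [[sum(xi[t][i]) for i in range(n)] for t in range(T - 1)]
    return xi, gamma


def reestimate_transitions(xi, gamma):
    """New a_ij = expected transitions i -> j / expected visits to i."""
    n = len(gamma[0])
    return [[sum(x[i][j] for x in xi) / sum(g[i] for g in gamma)
             for j in range(n)] for i in range(n)]


obs = [0, 2, 1, 0, 0]
xi, gamma = expected_counts(obs, A, B, pi)
A_new = reestimate_transitions(xi, gamma)
print(A_new)
```

In a full Baum-Welch run the same expected counts would also update B and pi, and the step would be repeated until $P(O \mid \lambda)$ stops improving.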
5. Applying HMMs to Automatic Speech Recognition (ASR)
Factors that affect ASR:
- Different situations
- Different speaking styles: recognizing isolated words is easier than recognizing a sequence of words, and recognizing read speech is easier than recognizing conversation
- Different speakers: speaker-independent vs. speaker-dependent
- Different environments: background noise
The task of speech recognition is to take an acoustic waveform as input and produce a sequence of words as output.
Given an acoustic observation sequence $O = (o_1 o_2 \dots o_n)$, the task of ASR is to find the corresponding word sequence $W = (w_1 w_2 \dots w_m)$ that has the highest posterior probability $P(W \mid O)$.
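The posterior is usually not modeled directly. A standard way to rewrite the search, shown here as a sketch using Bayes' rule, is the following; the symbol $\hat{W}$ for the chosen word sequence is introduced only for this illustration. $P(O)$ is the same for every candidate W, so it can be dropped from the maximization. The factor $P(O \mid W)$ is commonly called the acoustic model and $P(W)$ the language model.

```latex
\begin{aligned}
\hat{W} &= \arg\max_{W} P(W \mid O) \\
        &= \arg\max_{W} \frac{P(O \mid W)\, P(W)}{P(O)} \\
        &= \arg\max_{W} P(O \mid W)\, P(W)
\end{aligned}
```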
Structure of a simple speech recognition model
[Figure: block diagram of a simple speech recognizer, including an Acoustic Model and a Language Model.]
The HMM topology most commonly used for speech is constrained: each state may transition only to itself or to a following state (a left-to-right arrangement).
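A constrained topology of this kind corresponds to a transition matrix that is zero everywhere except on the diagonal and the entry just to its right. The following sketch builds such a matrix; the function name and the self-loop probability of 0.6 are hypothetical, and making the last state absorbing is one simple convention assumed here so that every row still sums to 1.

```python
def left_to_right_transitions(n_states, self_loop=0.6):
    """Build an n_states x n_states matrix where state i may only stay or move to i+1.

    self_loop is an illustrative value; in practice it would be estimated from data.
    The final state is made absorbing so that its row is still a distribution.
    """
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        A[i][i] = self_loop               # stay in the same state
        A[i][i + 1] = 1.0 - self_loop     # advance to the next state
    A[-1][-1] = 1.0
    return A


A = left_to_right_transitions(3)
for row in A:
    print(row)
```

Because backward and skip transitions have probability zero, any state path through such a model moves monotonically from the first state toward the last, which matches the way speech unfolds in time.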