
USING THE TMS320C6713 DSP KIT FOR SPEECH RECOGNITION, APPLIED TO COMPUTER MOUSE CONTROL

Students: Phm c Linh(1), Nguyn Vn Lm(1), L ng Trng(1), Nguyn Hu Trung(2)
Classes: (1) 07CLC2, (2) 08CLC2, Faculty of High-Quality Engineer Training, University of Science and Technology, University of Da Nang
Supervisor: Assoc. Prof. Dr. on Quang Vinh, Faculty of Electrical Engineering, University of Science and Technology, University of Da Nang
ABSTRACT
Speech recognition is a field of artificial intelligence that plays an important role in the revolution in communication between humans and machines. Speech recognition systems are widely used in applications such as control, security, communication, and translation. This paper presents a speech recognition method based on the HMM (Hidden Markov Model) and applies it to controlling a computer mouse, with the TMS320C6713 kit as the target platform.

1. Introduction
The foundations of speech recognition were laid by MacCarthy in 1958, and its applications are now fairly common in everyday life around the world. In Vietnam it is still a new field: research on speech recognition remains largely theoretical and has seen little practical use. The aim of this project is speech recognition and its applications on the TMS320C6713 kit. One such application is controlling a computer mouse by voice, which gives people with disabilities the opportunity to communicate and to use the functions of a computer in the normal way.

2. Speech recognition method and algorithm
2.1. Recognition process
[Figure 1: The recognition process. Blocks shown: Reference Speech Sample, Feature Extraction, Training; Test Speech Sample, Feature Extraction, Pattern Matching, Recognition Decision, Output, Computer Mouse.]

In general, a speech recognition system is organized as shown in Figure 1. It consists of two main phases:
- Training: The speech signal is sampled at 8 kHz and then converted into feature vectors (Feature Extraction). Algorithms such as HMM, DTW, or VQ are used to train on these vectors.
- Recognition: The speech sample to be recognized is compared with the previously trained templates. Finally, the test speech sample is assigned a label in the Recognition Decision stage.

2.1.1. Feature extraction process

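[Figure 2: Diagram of the feature extraction process to be embedded in the DSP.]

Features of the sample (training) signal are extracted with the MFCC method. The audio signal to be recognized is captured by a microphone. The speech signal is the excitation pulse signal convolved with the low-frequency signal produced by the vocal tract, and it is this vocal-tract component that we want to recover:
$s(n) = e(n) * t(n)$, where $*$ denotes convolution. (1)
- Preprocessing: noise filtering and word (keyword) detection.
- A Hamming window is used to cut the keyword into frames, with 63% overlap between frames. This process is called windowing.
$w(n) = 0.54 - 0.46 \cos\left(\frac{2\pi n}{N-1}\right), \quad n = 0, \ldots, N-1$ (2)
- FFT. The transform computed by the FFT (Fast Fourier Transform) of a windowed frame $x(n)$ is:
$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad k = 0, \ldots, N-1$ (3)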

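$S(e^{j\omega})$ is the spectrum of the speech signal, $E(e^{j\omega})$ is the spectrum of the excitation pulse, and $T(e^{j\omega})$ is the spectrum of the vocal tract. The magnitudes of these three components are therefore related by:
$S(e^{j\omega}) = E(e^{j\omega}) \cdot T(e^{j\omega})$ (4)
$\log S(e^{j\omega}) = \log E(e^{j\omega}) + \log T(e^{j\omega})$ (5)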

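Here the contribution of $T(e^{j\omega})$ appears in the low-frequency region, because the vocal tract changes slowly, while the contribution of the excitation pulse $E(e^{j\omega})$ appears mainly in the high-frequency region. The linear operation normally used is the inverse Fourier transform of $\log S(e^{j\omega})$, which yields the cepstrum of the signal (the term "cepstral" is "spectral" with its first letters reversed):
$c_s(n) = \mathrm{IDFT}\{\log S(e^{j\omega})\} = \mathrm{IDFT}\{\log E(e^{j\omega})\} + \mathrm{IDFT}\{\log T(e^{j\omega})\} = c_e(n) + c_t(n)$ (6)
- Mel frequency: To describe accurately how the human ear perceives frequency, a dedicated frequency scale is constructed. The Mel scale is based on experimental measurements of human hearing. The mapping is commonly given as:
$F_{Mel} = 2595 \log_{10}\left(1 + \frac{F_{Hz}}{700}\right)$
where $F_{Mel}$ is the perceptual frequency in Mel and $F_{Hz}$ is the physical frequency in Hz.
- DCT (in effect playing the role of the IFFT), evaluated for $k = 0, \ldots, N-1$. (7)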
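The front-end steps of the feature extraction (framing with a Hamming window, Fourier transform, log spectrum, inverse transform, and the Hz-to-Mel mapping) can be sketched as follows. This is a minimal illustration rather than the code used in the project: the 256-sample frame length, the test tone, the function names, and the 2595/700 Mel constants are assumptions, and a direct DFT stands in for the FFT to keep the example dependency-free.

```python
import cmath
import math


def hamming(N):
    """Hamming window w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)), n = 0..N-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]


def split_frames(signal, frame_len, overlap=0.63):
    """Cut the signal into overlapping frames and window each frame."""
    hop = max(1, round(frame_len * (1.0 - overlap)))  # samples between frame starts
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


def dft(x):
    """Direct O(N^2) DFT; an FFT returns the same values faster."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def real_cepstrum(frame, eps=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    log_mag = [math.log(abs(X) + eps) for X in dft(frame)]  # eps avoids log(0)
    N = len(log_mag)
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)).real / N
            for n in range(N)]


def hz_to_mel(f_hz):
    """Map a frequency in Hz to Mel (assumed, widely used constants)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


# Assumed test input: one second of a 440 Hz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]
frames = split_frames(tone, frame_len=256)
cep = real_cepstrum(frames[0])
```

A full MFCC front end would additionally pass the spectrum through a bank of Mel-spaced filters and apply the DCT to the log filter-bank energies; those steps are omitted here.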

The result of the training process is an acoustic vector for each frame. A speech signal consists of many frames; the set of acoustic vectors of these frames is called a codebook and is stored as the database.

2.1.2. HMM (Hidden Markov Model) algorithm
A hidden Markov model (HMM) is a statistical model in which the system being modeled is assumed to be a Markov chain with unknown parameters, and the task is to determine the hidden parameters from the observable ones on the basis of this assumption. The extracted model parameters can then be used for further analysis, for example in pattern recognition applications. In an ordinary Markov model the state is directly visible to the observer, so the state-transition probabilities are the only parameters. A hidden Markov model adds outputs: each state has a probability distribution over the possible output symbols. As a result, the sequence of symbols generated by an HMM does not directly reveal the sequence of states. Figure 3 illustrates this model.

[Figure 3: HMM scheme. Training: training samples of the words "một", "hai", "ba" (three utterances each), followed by parameter estimation. Recognition: the observation sequence O is scored as P(O|λ) against each word model.]

2.1.3. Using the HMM in speech recognition

[Figure 4: Left-right HMM with six states.]

After feature extraction we have a database of feature vectors for each word. We now build a hidden Markov model whose training data are these feature vectors. Training and recognition with the HMM are shown in Figure 3 for a vocabulary of three words: "một", "hai", "ba" (one, two, three). For each word to be recognized we have a database of features from several utterances (three recordings per word in the diagram). We then estimate the model parameters λ = (A, B, π) so that the probability P(O|λ) is maximized; each word corresponds to one specific λ. To recognize a word, we simply compute the probability of its observation sequence under each trained λ and choose the model with the highest probability:
$v^* = \arg\max_{v} P(O \mid \lambda_v)$

The reference literature and reports on successfully built recognition systems indicate that, for speech signals, the HMM is usually chosen as a left-right model with six states (see Figure 4). The training data were recorded at the MICA center from 30 speakers and comprise about 2000 sentences. The project uses the CMUSphinx toolkit from Carnegie Mellon University, running on Linux. The HMM recognizer uses 6 states and a vocabulary of 10 words (the digits 0 to 9); the results are fairly good, as shown in Table 1.

2.1.4. Computer mouse control
Once a keyword has been recognized, it is converted into a logic signal that drives the computer mouse. On Linux, xdotool is combined with the recognition result to control the mouse. The mouse responds fairly quickly and accurately.
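The recognition rule of section 2.1.3 and the keyword-to-mouse step of section 2.1.4 can be sketched together as follows. This is a minimal illustration, not the project's CMUSphinx-based implementation: it assumes a discrete HMM whose observations are codebook indices, scores a sequence with the forward algorithm, picks the word with the largest P(O|λ), and looks up an xdotool command line for it. The toy two-state models, the probabilities, the 20-pixel step, and the names forward_probability, recognize, and xdotool_args are all hypothetical.

```python
def forward_probability(obs, A, B, pi):
    """Return P(O | lambda) for a discrete HMM lambda = (A, B, pi)."""
    n = len(pi)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    # Induction: reach state j from any state i, then emit the next symbol.
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    # Termination: sum over all possible final states.
    return sum(alpha)


def recognize(obs, models):
    """Pick the word whose model gives the sequence the highest probability."""
    return max(models, key=lambda word: forward_probability(obs, *models[word]))


# Toy left-right models (assumed values): 2 states, 2 codebook symbols.
LEFT_RIGHT_A = [[0.7, 0.3],
                [0.0, 1.0]]
START = [1.0, 0.0]
MODELS = {
    "left":  (LEFT_RIGHT_A, [[0.9, 0.1], [0.2, 0.8]], START),
    "right": (LEFT_RIGHT_A, [[0.1, 0.9], [0.8, 0.2]], START),
}

STEP = 20  # assumed pointer step in pixels
XDOTOOL_ARGS = {
    # "--" stops option parsing so that negative offsets are accepted.
    "left":  ["xdotool", "mousemove_relative", "--", str(-STEP), "0"],
    "right": ["xdotool", "mousemove_relative", "--", str(STEP), "0"],
}


def xdotool_args(word):
    """Return the argument list that would move the pointer for this word."""
    return XDOTOOL_ARGS[word]


observation = [0, 0, 1]  # assumed sequence of codebook indices
word = recognize(observation, MODELS)
command = xdotool_args(word)
print(word, command)
```

The argument list is only built here, not executed; on a Linux desktop with xdotool installed it could be passed to subprocess.run. For utterances of realistic length the forward probabilities become very small, so a practical version would work with scaled or log probabilities.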

2.2. Results
Recognition was carried out successfully on the computer-based model, with the results shown in Table 1.

Table 1: Single-word recognition results

Key word      Probability
Một (1)       92%
Hai (2)       90%
Ba (3)        95%
Bốn (4)       92%
Năm (5)       96%
Sáu (6)       93%
Bảy (7)       90%
Tám (8)       97%
Chín (9)      89%

3. Conclusion
The project successfully built a speech recognition algorithm and recognized the digits 0 to 9 with a fairly good success rate. This is a stepping stone toward recognizing the words used to control a computer mouse, such as: up, down, left, right, click. Using the recognition results, the project also successfully built a mouse-control program on Linux. This forms the basis for porting the application to the TMS320C6713 DSP kit.
Advantages: The system responds in real time and can be trained on new words. The program can run in a variety of noisy environments. Words are trained and recognized independently of the speaker's gender and voice.
Limitations: The number of recognized words is still small, and words are recognized one at a time; continuous recognition has not yet been implemented. Moreover, mouse control has not yet been built on the DSP kit and so far runs only on the Linux operating system.

REFERENCES
[1] Open Source Toolkit For Speech Recognition, project by Carnegie Mellon University, http://cmusphinx.sourceforge.net/wiki/
[2] Lê Tiến Thường, Xử Lý Số Tín Hiệu và Wavelets, Tập 1, Nhà xuất bản Đại Học Quốc Gia TPHCM, 2002.
[3] TS Nguyễn Văn Giáp, KS Trần Việt Hồng, Kỹ thuật nhận dạng tiếng nói và ứng dụng trong điều khiển, Bộ môn Cơ điện tử, Khoa Cơ khí, Đại Học Bách khoa TPHCM.
[4] Nguyn Ph Bnh, Bài giảng Xử lý tiếng nói, Đại học Bách khoa Hà Nội.
[5] John Wiley & Sons, Digital Signal Processing and Applications with the C6713 and C6416 DSK, 2004.
