Major: Information Processing & Communications

Design of an Embedded Speaker Recognition System on the T-Engine SH7760

Student: Nguyen Thanh Kien
Supervisor: Dr. Trinh Van Loan

Outline
1. Introduction.
2. Speaker recognition.
3. T-Engine embedded system design.
4. Speaker recognition software design.
5. Results and future work.

1. Introduction

1.1. Reasons for choosing the topic
- Human-computer interaction increasingly demands a high degree of intuitiveness.
- Speech is the most common means of communication that people use.
- Human-machine interaction through voice is therefore an inevitable requirement.
- In addition, dedicated embedded systems are increasingly developed and widely used, which makes it possible to build small smart devices that can still understand human speech.

1.2. Objectives of the project
- Build a speaker recognition program that uses the GMM model and works with arbitrary spoken words.
- Design an embedded system based on the SH7760 chip to run the recognition program.

2. Overview of speaker recognition
Speaker recognition takes two forms:
- Speaker identification
- Speaker verification

2.1. Feature extraction
- Pre-processing
- Framing
- Window function
- MFCC feature extraction method

2.1.1 Pre-processing
- Pre-emphasis filter: $H(z) = 1 - a z^{-1}$ with $0.95 \le a < 0.97$.
- Silence removal: an energy threshold over the frames,
  Threshold = MinValue + Ratio * (MeanValue - MinValue) (Ratio ~ 0.3).
- Speech detection (voice activity detection), based on parameters of the signal, here the short-time energy function:
  if ((log10(SP) - log10(NP)) > g_dblNoiseThreshold) bSpeechFlag = TRUE;

2.1.2 Framing
The speech signal is divided into frames of equal size.
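
The pre-processing and framing steps above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the project's actual source: the function names (`pre_emphasis`, `frame_energies`, `silence_threshold`) are hypothetical, frames are assumed not to overlap, and the sample preceding the first one is assumed to be zero.

```c
#include <assert.h>
#include <stddef.h>

/* Pre-emphasis H(z) = 1 - a*z^-1, i.e. y[n] = x[n] - a*x[n-1].
   Assumption: x[-1] is treated as 0. */
void pre_emphasis(const double *x, double *y, size_t n, double a)
{
    assert(x != NULL && y != NULL);
    double prev = 0.0;
    for (size_t i = 0; i < n; ++i) {
        double cur = x[i];
        y[i] = cur - a * prev;
        prev = cur;              /* saved first, so y may alias x */
    }
}

/* Short-time energy (sum of squares) of each equal-size frame.
   Assumption: frames do not overlap; a trailing partial frame is dropped.
   Returns the number of frames written to energy[]. */
size_t frame_energies(const double *x, size_t n, size_t frame_len,
                      double *energy)
{
    assert(x != NULL && energy != NULL && frame_len > 0);
    size_t frames = n / frame_len;
    for (size_t f = 0; f < frames; ++f) {
        double e = 0.0;
        for (size_t k = 0; k < frame_len; ++k) {
            double s = x[f * frame_len + k];
            e += s * s;
        }
        energy[f] = e;
    }
    return frames;
}

/* Threshold = MinValue + Ratio * (MeanValue - MinValue),
   computed over the frame energies. */
double silence_threshold(const double *energy, size_t frames, double ratio)
{
    assert(energy != NULL && frames > 0);
    double min = energy[0];
    double sum = 0.0;
    for (size_t f = 0; f < frames; ++f) {
        if (energy[f] < min) {
            min = energy[f];
        }
        sum += energy[f];
    }
    double mean = sum / (double)frames;
    return min + ratio * (mean - min);
}
```

In such a sketch, frames whose energy falls below the returned threshold would be treated as silence and dropped, with `a` taken from the range 0.95 to 0.97 and `ratio` around 0.3 as quoted above.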
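
2.1.3 Window function
- Hamming window: $w(k) = 0.54 - 0.46\cos\left(\frac{2\pi k}{K+1}\right)$
- Hanning window: $w(k) = 0.5 - 0.5\cos\left(\frac{2\pi k}{K+1}\right)$
where $K$ is the number of samples in the frame.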

[Figure: Hamming window]

2.1.4 Feature vector extraction
Features in current use:
- LPC coefficients (Linear Predictive Coding).
- PLP coefficients (Perceptual Linear Prediction).
- MFCC coefficients (Mel Frequency Cepstral Coefficients).

MFCC processing chain:
Speech frame -> pre-processing + windowing -> |FFT| (magnitude spectrum) -> Mel filter bank (Mel-filtered spectrum) -> log(.) -> DCT -> MFCC vector

2.2. Gaussian mixture model - GMM
A Gaussian mixture model is a combination of several components, each of which is a normal (Gaussian) distribution.
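
A Gaussian mixture density:
$$p(x \mid \lambda) = \sum_{i=1}^{M} p_i \, b_i(x)$$
where $x$ is a $D$-dimensional vector and
$$b_i(x) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (x - \mu_i)' \, \Sigma_i^{-1} (x - \mu_i) \right\}$$
- $\mu_i$ is the mean vector,
- $\Sigma_i$ is the covariance matrix,
- $p_i$ is the weight of component $i$ in the mixture.

A Gaussian mixture model is described by the following parameters:
(a) the number of Gaussian components,
(b) the mean vector and covariance matrix of each component,
(c) the weight of each component.

The parameter set of a Gaussian mixture model is
$$\lambda = \{ p_i, \mu_i, \Sigma_i \}, \quad i = 1, \dots, M$$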

3. T-Engine embedded system design
T-Engine is an open standard for real-time embedded systems that covers both the hardware and the real-time operating system:
- Hardware: T-Engine board
- Real-time operating system: T-Kernel

[Figure: block diagram of the embedded board]

4. Speaker recognition software design

Model training
The person being enrolled reads the training sentence aloud 3 to 5 times.

Speaker recognition with arbitrary spoken words
Recognition runs in two modes:
- Real-time recognition
- Speaker verification

Algorithms for improving recognition quality
- Set a recognition-score threshold for each speaker.
- Generate random words for training.
- Recognize with several different words over several attempts.

5. Results
- An embedded speaker recognition system that works with arbitrary spoken words was built successfully.
- Recognition accuracy reached 97%.

Program screens
- Entering the enrolled speaker's information
- Recognition
- Setting the threshold for each enrolled speaker

Test results
- The system was tested on 30 people, with audio recorded at 44100 Hz, 16 bit, mono.
- Each person read the training sentence 2 times and was then tested 10 times with 10 arbitrary words.

Name               Gender  Age  Home province  Training readings  Test trials  Result (correct rate)
Le Hoai Phuong     Male    23   Hanoi          2                  10           100%
Ngo Chi Minh       Male    23   Hanoi          2                  10           90%
Nguyen Canh Diep   Male    17   Vinh Phu       2                  10           100%
Nguyen Hai Ha      Male    23   Hanoi          2                  10           100%
Nguyen Ngoc Hung   Male    19   Hai Duong      2                  10           100%
Nguyen Quang Hiep  Male    23   Hanoi          2                  10           100%
Nguyen Thi Hau     Female  23   Bac Giang      2                  10           90%
Nguyen Tien Manh   Male    23   Hanoi          2                  10           100%
Nguyen Xuan Giang  Male    31   Ha Nam         2                  10           100%
Pham Thi Nhan      Female  23   Bac Ninh       2                  10           80%
Phan Van Diep      Male    23   Nghe An        2                  10           100%
Tran Manh Linh     Male    23   Hanoi          2                  10           90%
Vuong Quang Hung   Male    18   Hanoi          2                  10           100%
Bui Thi Yen        Female  20   Hanoi          2                  10           100%
Dang Thi May       Female  20   Nam Dinh       2                  10           90%
Do Dinh Sy         Male    21   Nam Dinh       2                  10           100%
Pham Hung Duc      Male    21   Phu Tho        2                  10           100%
Trinh Xuan Kien    Male    21   Hanoi          2                  10           100%

Average result: 97%

Future work
- The board's audio-capture codec module is currently noisy. This hardware will be reworked to reduce noise and raise recognition accuracy.
- Add fundamental frequency (F0) parameters for the tones to the model to improve recognition accuracy.

Questions from the committee