
Master's Thesis

Trường Đại học Bách Khoa Hà Nội (Hanoi University of Science and Technology)


Major: Information Processing and Communication
Design of an Embedded Speaker Recognition System on the
T-Engine SH7760
Student: Nguyễn Thành Kiên
Supervisor: Dr. Trịnh Văn Loan
Outline
1. Introduction to the topic.
2. Speaker recognition.
3. T-Engine embedded system design.
4. Speaker recognition software design.
5. Results and future work.

1. Introduction to the topic
1.1. Motivation.
1.2. Objectives.
1.1. Motivation
- Human-computer interaction increasingly demands a high degree of intuitiveness.
- Speech is the most common means of communication that people use.
- Human-machine interaction through voice is an inevitable need.
- Meanwhile, dedicated embedded systems keep advancing and are widely used, which makes it possible to build small smart devices that can understand human speech.
1.2. Objectives
- Build a speaker recognition program that uses a GMM and works with arbitrary words (text-independent recognition).
- Design an embedded system based on the SH7760 chip to run the recognition program.
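2. Overview of speaker recognition
Speaker recognition takes two forms:
- Speaker identification: determining which enrolled speaker produced an utterance.
- Speaker verification: confirming whether the speaker is who they claim to be.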
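The difference between the two forms can be sketched in a few lines of C. This is only an illustration: it assumes each enrolled speaker already has a numeric score for the utterance where higher means a better match (for example a log-likelihood under that speaker's model), and the function names `identify_speaker` and `verify_speaker` are hypothetical, not taken from the thesis software.

```c
#include <stddef.h>

/* Identification: pick the enrolled speaker with the highest score.
 * Returns the speaker index, or -1 when no speakers are enrolled. */
int identify_speaker(const double *score, size_t n_speakers)
{
    if (n_speakers == 0) return -1;
    size_t best = 0;
    for (size_t s = 1; s < n_speakers; ++s)
        if (score[s] > score[best]) best = s;
    return (int)best;
}

/* Verification: accept the claimed identity only when that speaker's
 * score reaches that speaker's own threshold (assumed to be given). */
int verify_speaker(const double *score, const double *threshold, size_t claimed)
{
    return score[claimed] >= threshold[claimed];
}
```

Identification compares every enrolled speaker against the others, while verification checks a single claimed identity against a threshold.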
2.1. Feature extraction
- Preprocessing
- Framing
- Window function
- MFCC feature extraction method



2.1.1 Preprocessing
- Pre-emphasis filter:
  $H(z) = 1 - a z^{-1}$, with $0.95 \le a < 0.97$
- Silence removal:
  Frame energy threshold
  Threshold = MinValue + Ratio * (MeanValue - MinValue)
  (Ratio ~ 0.3)
- Voice activity detection (VAD).
  Based on signal parameters:
  Short-time energy function
  if ((log10(SP) - log10(NP)) > g_dblNoiseThreshold)
      bSpeechFlag = TRUE;
2.1.2 Framing
The speech signal is divided into frames of equal size.
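Framing and the frame-energy threshold used for silence removal can be sketched together in C. The slides only state that frames have equal size; the `hop` parameter (frame step) is an assumption added here so that overlapping frames are possible, and setting `hop` equal to `frame_len` gives non-overlapping frames. All function names are hypothetical.

```c
#include <stddef.h>

/* Number of complete frames of length frame_len taken every hop samples. */
size_t frame_count(size_t n, size_t frame_len, size_t hop)
{
    if (frame_len == 0 || hop == 0 || n < frame_len) return 0;
    return (n - frame_len) / hop + 1;
}

/* Short-time energy of each frame: the sum of its squared samples.
 * energy must have room for frame_count(n, frame_len, hop) values. */
void frame_energies(const double *x, size_t n, size_t frame_len,
                    size_t hop, double *energy)
{
    size_t frames = frame_count(n, frame_len, hop);
    for (size_t f = 0; f < frames; ++f) {
        double e = 0.0;
        for (size_t i = 0; i < frame_len; ++i) {
            double s = x[f * hop + i];
            e += s * s;
        }
        energy[f] = e;
    }
}

/* Threshold = MinValue + Ratio * (MeanValue - MinValue),
 * computed over the frame energies. */
double silence_threshold(const double *energy, size_t frames, double ratio)
{
    if (frames == 0) return 0.0;
    double min = energy[0], sum = 0.0;
    for (size_t f = 0; f < frames; ++f) {
        if (energy[f] < min) min = energy[f];
        sum += energy[f];
    }
    return min + ratio * (sum / (double)frames - min);
}
```

Frames whose energy falls below the returned threshold would then be treated as silence and dropped before feature extraction.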

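2.1.3 Window function
- Hamming window:
  $w(k) = 0.54 - 0.46\cos\left(\frac{2\pi k}{K+1}\right)$
- Hanning window:
  $w(k) = 0.5 - 0.5\cos\left(\frac{2\pi k}{K+1}\right)$
where $k = 1, \dots, K$ and $K$ is the window length in samples.

[Figure: Hamming window]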
2.1.4 Feature vector extraction
Features in common use today:
- LPC coefficients (Linear Prediction Coding)
- PLP coefficients (Perceptual Linear Prediction)
- MFCC coefficients (Mel Frequency Cepstral Coefficients)
[Figure: MFCC extraction block diagram. Speech frame -> preprocessing + windowing (windowed frame) -> |FFT| (magnitude spectrum) -> Mel filter bank (Mel-filtered spectrum) -> log(.) -> DCT -> MFCC vector (result)]
2.2. Gaussian Mixture Model (GMM)
A Gaussian mixture model is a combination of several components, each of which is a normal (Gaussian) distribution.

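A Gaussian mixture density:
$$p(\vec{x} \mid \lambda) = \sum_{i=1}^{M} p_i \, b_i(\vec{x})$$
where $\vec{x}$ is a $D$-dimensional vector and
$$b_i(\vec{x}) = \frac{1}{(2\pi)^{D/2} \, |\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2} (\vec{x} - \vec{\mu}_i)' \, \Sigma_i^{-1} \, (\vec{x} - \vec{\mu}_i) \right\}$$
- $\vec{\mu}_i$ is the mean vector
- $\Sigma_i$ is the covariance matrix
- $p_i$ is the weight of the component in the mixture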
A Gaussian mixture model is described by the following parameters:
(a) the number of Gaussian components
(b) the mean vector and covariance matrix of each component
(c) the weight of each component

The parameter set of a model is
$$\lambda = \{p_i, \vec{\mu}_i, \Sigma_i\}, \quad i = 1, \dots, M$$
3. T-Engine embedded system design
T-Engine is an open standard for real-time embedded systems that covers both the hardware and the real-time operating system:
- Hardware: T-Engine board
- Real-time operating system: T-Kernel
[Figure: block diagram of the embedded board]
4. Speaker recognition software design

Model training
- The speaker being enrolled reads the training sentence 3 to 5 times.

Text-independent speaker recognition
Recognition runs in two modes:
- Real-time recognition
- Speaker verification

Algorithms that improve recognition quality
- Setting a recognition score threshold for each speaker
- Generating random words for training
- Recognizing with several different words over multiple attempts
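One possible way to combine several recognition attempts is a strict majority vote, sketched below in C. The slides do not say how the attempts are combined, so the voting rule, the function name `majority_vote`, and the `MAX_SPEAKERS` limit are all assumptions. Each attempt is taken to yield a speaker index, or -1 when the score fell below that speaker's threshold.

```c
#include <stddef.h>

#define MAX_SPEAKERS 64   /* sketch limit, not from the slides */

/* decisions[t] is the speaker index chosen on attempt t, or -1 when
 * the attempt was rejected. Returns the speaker chosen in a strict
 * majority of all attempts, or -1 when no speaker reaches a majority. */
int majority_vote(const int *decisions, size_t n_attempts, size_t n_speakers)
{
    size_t votes[MAX_SPEAKERS] = {0};
    if (n_speakers > MAX_SPEAKERS) return -1;

    for (size_t t = 0; t < n_attempts; ++t)
        if (decisions[t] >= 0 && (size_t)decisions[t] < n_speakers)
            ++votes[decisions[t]];

    for (size_t s = 0; s < n_speakers; ++s)
        if (2 * votes[s] > n_attempts) return (int)s;
    return -1;
}
```

Requiring a strict majority means a single mistaken attempt cannot change the outcome, while a run of inconsistent attempts is rejected rather than guessed.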
5. Results
- Successfully built an embedded speaker recognition system that works with arbitrary spoken words.
- Recognition accuracy reached 97%.
Program screens
[Figure: entering the enrolled speaker's information]
[Figure: recognition]
[Figure: setting the threshold for each enrolled speaker]
Experimental results
- The system was tested with 30 people; recordings were made at 44100 Hz, 16-bit, mono.
- Each person read the training sentence 2 times and was tested 10 times with 10 arbitrary words.
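| Name | Gender | Age | Province | Training readings | Test trials | Result (accuracy) |
|---|---|---|---|---|---|---|
| Le Hoai Phuong | Male | 23 | Hà Nội | 2 | 10 | 100% |
| Ngo Chi Minh | Male | 23 | Hà Nội | 2 | 10 | 90% |
| Nguyen Canh Diep | Male | 17 | Vĩnh Phú | 2 | 10 | 100% |
| Nguyen Hai Ha | Male | 23 | Hà Nội | 2 | 10 | 100% |
| Nguyen Ngoc Hung | Male | 19 | Hải Dương | 2 | 10 | 100% |
| Nguyen Quang Hiep | Male | 23 | Hà Nội | 2 | 10 | 100% |
| Nguyen Thi Hau | Female | 23 | Bắc Giang | 2 | 10 | 90% |
| Nguyen Tien Manh | Male | 23 | Hà Nội | 2 | 10 | 100% |
| Nguyen Xuan Giang | Male | 31 | Hà Nam | 2 | 10 | 100% |
| Pham Thi Nhan | Female | 23 | Bắc Ninh | 2 | 10 | 80% |
| Phan Van Diep | Male | 23 | Nghệ An | 2 | 10 | 100% |
| Tran Manh Linh | Male | 23 | Hà Nội | 2 | 10 | 90% |
| Vuong Quang Hung | Male | 18 | Hà Nội | 2 | 10 | 100% |
| Bui Thi Yen | Female | 20 | Hà Nội | 2 | 10 | 100% |
| Dang Thi May | Female | 20 | Nam Định | 2 | 10 | 90% |
| Do Dinh Sy | Male | 21 | Nam Định | 2 | 10 | 100% |
| Pham Hung Duc | Male | 21 | Phú Thọ | 2 | 10 | 100% |
| Trinh Xuan Kien | Male | 21 | Hà Nội | 2 | 10 | 100% |

Average result achieved: 97%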
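As a quick arithmetic check, the 18 rows listed above contain thirteen results of 100%, four of 90%, and one of 80%, so their mean works out as follows (this covers only the listed rows, not necessarily all 30 test subjects):

```latex
\[
\frac{13 \cdot 100 + 4 \cdot 90 + 1 \cdot 80}{18}
  = \frac{1740}{18}
  \approx 96.7\%
\]
```

This rounds to the reported 97%.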
Future work
- The board's audio-capture codec module is currently noisy; this hardware will be re-standardized to reduce noise and raise recognition accuracy.
- Add fundamental frequency (F0) parameters for the tones to the model to improve recognition accuracy.
Questions from the committee

Thank you very much!
