
Exercise 1: SPEAKER RECOGNITION
CONTENTS
1 Problem description
2 Method
3 Results
3.1 Evaluation method (how recognition results are tallied)
3.2 Results (by accuracy)
4 Extension: the GMM (Gaussian Mixture Model) method
4.1 Method description
4.1.1 The GMM method
4.1.2 Steps of the algorithm
4.2 Results
4.3 Comparison with the Gaussian method

1 Problem description
Features of the voice:
- Fundamental frequency: characterizes how high or low the voice is (f1, f2, f3).
- MFCC (mel-frequency cepstral coefficients): characterizes the shape of the oral cavity.
The task is to recognize speakers from their MFCC features using the Gaussian method.

2 Method
The method has three main steps.

Step 1: Prepare the data
To recognize n speakers, create n directories, each holding the .wav recordings of one speaker (30 s per file). If a speaker has several .wav files, concatenate all of that speaker's data into a single file.
This yields n files for n speakers (f1, f2, ..., fn), which form the training data.
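The concatenation in Step 1 can be sketched in Python with the standard wave module. This is only an illustration: the function name concatenate_wavs and the directory layout are assumptions, and the sketch requires all files of one speaker to share the same channel count, sample width, and sample rate.

```python
import wave
from pathlib import Path


def concatenate_wavs(speaker_dir, out_path):
    """Join every .wav file in speaker_dir (sorted by name) into out_path."""
    wav_paths = sorted(Path(speaker_dir).glob("*.wav"))
    if not wav_paths:
        raise ValueError(f"no .wav files found in {speaker_dir}")

    params = None
    chunks = []
    for path in wav_paths:
        with wave.open(str(path), "rb") as src:
            fmt = (src.getnchannels(), src.getsampwidth(), src.getframerate())
            if params is None:
                params = fmt  # the first file fixes the expected format
            elif fmt != params:
                raise ValueError(f"{path} has a different audio format")
            chunks.append(src.readframes(src.getnframes()))

    # Write one file holding the audio of all inputs, back to back.
    with wave.open(str(out_path), "wb") as dst:
        dst.setnchannels(params[0])
        dst.setsampwidth(params[1])
        dst.setframerate(params[2])
        dst.writeframes(b"".join(chunks))
    return len(wav_paths)
```

Calling concatenate_wavs once per speaker directory would produce the n training files f1, ..., fn. Write the output outside the speaker directory so that a second run does not pick it up as input.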
Step 2: Train the system on each speaker's voice
Build n models, one per speaker, which amounts to building n multivariate Gaussians.
Algorithm for building the n multivariate Gaussians:
for i = 1 to n    // each iteration yields the Gaussian of one speaker
{
    ai = wavread('fi.wav');   // read the .wav file of speaker i
    bi = mfcc(ai);            // extract the MFCC features of speaker i

    // Build the Gaussian for speaker i.
    μi = mean(bi);
    Σi = var(bi);
    // With μi and Σi known, the Gaussian Gi of speaker i is determined.
}
After the loop, we have n Gaussians (G1, G2, ..., Gn).
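The training loop can be illustrated with a small pure-Python sketch. It assumes the MFCC frames are already available as lists of numbers (the feature values below are made-up toy data, not real MFCCs), and it models each speaker with one mean and one variance per feature dimension, which is what applying mean and var to a feature matrix gives. The name fit_diagonal_gaussian is hypothetical.

```python
def fit_diagonal_gaussian(frames):
    """Return per-dimension (mean, variance) for a list of equal-length feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Sample variance (divide by n - 1), which is the default of MATLAB's var().
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / (n - 1) for d in range(dim)]
    return mean, var


# Toy stand-in for the MFCC frames of each speaker (assumed values).
training = {
    "speaker1": [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    "speaker2": [[10.0, 0.0], [12.0, 2.0]],
}

# One Gaussian (mean vector, variance vector) per speaker.
models = {name: fit_diagonal_gaussian(frames) for name, frames in training.items()}
print(models["speaker1"])  # ([3.0, 4.0], [4.0, 4.0])
```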
Step 3: Recognize the speaker
Prepare the test data: a set of test files (x1, x2, ..., xz).
Check the files one at a time. For each xi:
Compute bxi = mfcc(xi).
Compute n values p from the n Gaussians trained in Step 2:
p1 = G1(bxi; μ1, Σ1)
p2 = G2(bxi; μ2, Σ2)
...
pn = Gn(bxi; μn, Σn)
Bayes decision: file xi is attributed to speaker i if pi is the largest of these values.
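A minimal sketch of this decision rule follows. It makes several assumptions that the text does not state: frames are treated as independent, each Gaussian has a diagonal covariance, all speakers are equally likely a priori, and the comparison is done on log-likelihoods to avoid numerical underflow. The model parameters are toy values and the function names are hypothetical.

```python
import math


def log_likelihood(frames, mean, var):
    """Sum over all frames of log N(frame; mean, diag(var))."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def recognize(frames, models):
    """Return the name of the model with the highest log-likelihood."""
    return max(models, key=lambda name: log_likelihood(frames, *models[name]))


# Toy models: (mean vector, variance vector) per speaker (assumed values).
toy_models = {
    "speaker1": ([0.0, 0.0], [1.0, 1.0]),
    "speaker2": ([5.0, 5.0], [1.0, 1.0]),
}
test_frames = [[4.8, 5.1], [5.2, 4.9]]
print(recognize(test_frames, toy_models))  # speaker2
```

Taking the largest log-likelihood selects the same speaker as taking the largest likelihood, because the logarithm is monotonic.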

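3 Results
3.1 Evaluation method (how recognition results are tallied)
Each test file is run through the recognizer, which prints the index of the recognized speaker: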
Test file: train/s1.wav - Recognize: 1
Test file: train/s2.wav - Recognize: 1
Test file: train/s3.wav - Recognize: 1
Test file: train/s4.wav - Recognize: 4
Test file: train/s5.wav - Recognize: 1
Test file: train/s6.wav - Recognize: 1
Test file: train/s7.wav - Recognize: 1
Test file: train/s8.wav - Recognize: 4
Test file: train/s9.wav - Recognize: 10
Test file: train/s10.wav - Recognize: 10
Test file: train/s11.wav - Recognize: 10
Test file: train/s12.wav - Recognize: 4


3.2 Results (by accuracy)
Accuracy: 25% (3 of the 12 test files are recognized correctly).
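The 25% figure can be checked by tallying the log from Section 3.1. The sketch below assumes that file sK.wav belongs to speaker K, which the report does not state explicitly but which is consistent with the reported accuracy.

```python
import re

LOG = """\
Test file: train/s1.wav - Recognize: 1
Test file: train/s2.wav - Recognize: 1
Test file: train/s3.wav - Recognize: 1
Test file: train/s4.wav - Recognize: 4
Test file: train/s5.wav - Recognize: 1
Test file: train/s6.wav - Recognize: 1
Test file: train/s7.wav - Recognize: 1
Test file: train/s8.wav - Recognize: 4
Test file: train/s9.wav - Recognize: 10
Test file: train/s10.wav - Recognize: 10
Test file: train/s11.wav - Recognize: 10
Test file: train/s12.wav - Recognize: 4
"""


def tally(log_text):
    """Return (number correct, number of test files) for a recognition log."""
    pattern = re.compile(r"Test file: \S*?s(\d+)\.wav - Recognize: (\d+)")
    pairs = pattern.findall(log_text)
    # A file counts as correct when the speaker index in its name matches the output.
    correct = sum(1 for true_id, predicted in pairs if true_id == predicted)
    return correct, len(pairs)


correct, total = tally(LOG)
print(f"{correct}/{total} = {correct / total:.0%}")  # 3/12 = 25%
```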

4 Extension: the GMM (Gaussian Mixture Model) method

4.1 Method description
4.1.1 The GMM method
A Gaussian mixture model (GMM) is a statistical model whose parameters are estimated from training data.
In essence, a GMM approximates a probability density function by a mixture of Gaussian density functions.
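The probability density function of a Gaussian distribution, f_N(x; μ, Σ), for a d-dimensional vector x is:

$$ f_N(x; \mu, \Sigma) = \frac{1}{(2\pi)^{d/2} \, |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^{T} \Sigma^{-1} (x - \mu) \right) $$

Given M Gaussian distributions p_1, p_2, ..., p_M, the probability density function of the model is their weighted sum:

$$ p(x) = \sum_{i=1}^{M} w_i \, p_i(x) $$

where:
- w_i is the weight of the i-th Gaussian, subject to 0 ≤ w_i ≤ 1 and $\sum_{i=1}^{M} w_i = 1$. The weights express how strongly each Gaussian influences the GMM: the larger a Gaussian's variance and weight, the greater its influence on the output of the model.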
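To make the weighted sum concrete, here is a small one-dimensional sketch of a mixture density. The function names and the example parameters are assumptions chosen for illustration, and the one-dimensional case replaces the covariance matrix with a single variance.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmm_pdf(x, weights, means, variances):
    """Weighted sum of Gaussian densities; the weights must lie in [0, 1] and sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9 or any(w < 0 or w > 1 for w in weights):
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


# A two-component mixture: 30% around 0 and 70% around 4 (assumed values).
density = gmm_pdf(1.0, [0.3, 0.7], [0.0, 4.0], [1.0, 2.0])
print(density)
```

Because the weights sum to 1 and each component integrates to 1, the mixture is itself a valid density.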

4.1.2 Steps of the algorithm

Step 1: Prepare the data
To recognize n speakers, create n directories, each holding the .wav recordings of one speaker (30 s per file). If a speaker has several .wav files, concatenate all of that speaker's data into a single file.
This yields n files for n speakers (f1, f2, ..., fn), which form the training data.
Step 2: Train the system on each speaker's voice
Build n models, one per speaker, which amounts to fitting n GMMs.
Algorithm for building the n GMMs:
The function gmm_estimate fits the statistical model with the GMM algorithm and returns the means, variances, and weights of each model it creates.
[mu, sigma, c] = gmm_estimate(X, M, <iT, mu, sigm, c, Vm>)
X: the data matrix (L x T), stored column by column
M: the number of Gaussians (given in advance)
iT: the number of iterations, 10 by default
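The report does not describe what happens inside gmm_estimate. GMM parameters are commonly fitted with the expectation-maximization (EM) algorithm, so the following is a generic one-dimensional EM sketch under that assumption, not the implementation of gmm_estimate. The function name em_gmm_1d, the initialization, and the toy data are all illustrative choices.

```python
import math


def em_gmm_1d(data, init_means, n_iter=10):
    """Fit a one-dimensional GMM with EM, starting from the given initial means."""
    m = len(init_means)
    means = list(init_means)
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var] * m   # start every component with the global variance
    weights = [1.0 / m] * m         # start with equal weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [w * math.exp(-0.5 * (x - mu) ** 2 / v) / math.sqrt(2 * math.pi * v)
                    for w, mu, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance of each component.
        for k in range(m):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to keep the variance positive
    return weights, means, variances


# Toy data with two clear clusters around 1 and 5 (assumed values).
weights, means, variances = em_gmm_1d([0.9, 1.0, 1.1, 4.9, 5.0, 5.1], [0.0, 6.0])
print(weights, means)
```

The default of 10 iterations mirrors the default of iT listed above. The variance floor is a common safeguard against a component collapsing onto a single sample.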
Step 3: Recognize the speaker
Recognition uses the function lmultigauss: [lYM, lY] = lmultigauss(X, mu, sigma, c)
It evaluates the multivariate Gaussian from the test data (X), the means (mu), the diagonal covariance matrices of the model (sigma), and the matrix of model weights (c).
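As an illustration of scoring test data against a trained GMM, the sketch below computes the average per-frame log-likelihood under a diagonal-covariance mixture. It is not the code of lmultigauss. Averaging over frames, the requirement that all weights are positive, and the name gmm_avg_log_likelihood are assumptions made here.

```python
import math


def gmm_avg_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for frame in frames:
        # Log of (weight * Gaussian density) for each mixture component.
        log_terms = []
        for w, mu, var in zip(weights, means, variances):
            log_gauss = sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
                            for x, m, v in zip(frame, mu, var))
            log_terms.append(math.log(w) + log_gauss)
        # Log-sum-exp keeps the sum stable when the terms are very negative.
        peak = max(log_terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return total / len(frames)


# Toy two-component model in two dimensions (assumed values).
score = gmm_avg_log_likelihood(
    [[0.1, -0.1], [3.9, 4.2]],
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [4.0, 4.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
)
print(score)
```

Scoring one test file against every speaker's model and keeping the highest score gives the recognized speaker.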
4.2 Results
Setup: 5 training files, 5 test files, 5 Gaussian components.
Column i corresponds to the test data of speaker i.
Row i corresponds to the training data of speaker i.
The recognized speaker is the row holding the largest value in each column.
A =
  -15.0576  -15.6611  -16.8774  -28.7799  -28.4373
  -14.9673  -13.9481  -14.4844  -29.7783  -29.9242
  -15.2371  -14.1288  -13.0496  -28.0165  -28.3526
  -31.7188  -30.9641  -31.9618   -9.9211   -9.5705
  -36.5947  -33.8140  -36.5920  -10.2937   -8.6886

Result: 100% correct
Setup: 5 training files, 5 test files, 2 Gaussian components.
A =
  -15.6719  -16.5151  -18.0927  -28.1566  -27.0804
  -15.3359  -14.8170  -14.8154  -23.9762  -24.0877
  -15.4261  -14.7310  -13.8210  -25.6009  -25.6359
  -35.6308  -34.3877  -34.7713  -11.2017  -11.1764
  -37.4841  -36.0404  -36.9911  -11.8098  -10.5282

Result: 60% correct
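The decision rule (largest value in each column) can be applied to the two-component score matrix with a few lines of Python. The sketch assumes the 25 values are laid out row by row, five per row, in the order they are listed. The function name recognize_columns is hypothetical.

```python
# Score matrix for 2 Gaussian components: rows = trained models, columns = test files.
A = [
    [-15.6719, -16.5151, -18.0927, -28.1566, -27.0804],
    [-15.3359, -14.8170, -14.8154, -23.9762, -24.0877],
    [-15.4261, -14.7310, -13.8210, -25.6009, -25.6359],
    [-35.6308, -34.3877, -34.7713, -11.2017, -11.1764],
    [-37.4841, -36.0404, -36.9911, -11.8098, -10.5282],
]


def recognize_columns(scores):
    """For each column (test speaker), return the 1-based row with the largest score."""
    n_rows, n_cols = len(scores), len(scores[0])
    return [max(range(n_rows), key=lambda r: scores[r][c]) + 1 for c in range(n_cols)]


predicted = recognize_columns(A)
# A column is correct when the winning row index equals the column index.
correct = sum(1 for col, row in enumerate(predicted, start=1) if row == col)
print(predicted, f"{correct / len(predicted):.0%}")  # [2, 3, 3, 4, 5] 60%
```

Under this layout, test speakers 3, 4, and 5 are recognized correctly, which agrees with the reported 60%.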
4.3 Comparison with the Gaussian method
GMM system
The system as built achieves 96.8% classification accuracy on a clean-speech data set (audio recorded with a good-quality microphone) of 10 speakers, with 8-second test segments, and 94.5% on a telephone-speech data set of 5 speakers, also with 8-second test segments.
Gaussian model
This model achieves 67.1% classification accuracy on a speech data set of 16 speakers, with 5-second test segments.
