
Using HTK

Hidden Markov Model Toolkit

Chi-Yueh Lin
2006/07/17
HTK

HTK (Hidden Markov Model Toolkit) is a toolkit for building and manipulating hidden Markov models (HMMs), used mainly for speech recognition.
HTK

Before building a recognition system with HTK, the following must be prepared:

– A speech corpus (Speech Corpus) and its annotation files (Transcription/Label files)

HTK

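Training tools
– HCopy – feature extraction from speech files
– HInit & HRest – model training when time-aligned labels (Label) are available
– HCompV & HERest – model training when only transcriptions (Transcription) are available
– HHEd – HMM editing, such as increasing the number of mixtures

Recognition tools
– HParse – converts a grammar into a word network
– HVite – recognition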

HCopy - abstract

HCopy is the HTK tool for copying speech data files and converting them between formats and parameter kinds.

Its most common use is to convert speech stored as waveform samples (PCM) into MFCC feature parameters.
HCopy – block diagram

To process speech, HCopy needs two files: a config file and a script file.


HCopy – config file
# (1)
NATURALREADORDER=TRUE
NATURALWRITEORDER=TRUE
# (2)
SOURCEFORMAT=NOHEAD
SOURCEKIND=WAVEFORM
TARGETKIND=MFCC_E_D_A
# (3)
SOURCERATE=1250
TARGETRATE=100000
WINDOWSIZE=250000
# (4)
ZMEANSOURCE=T
USEHAMMING=T
PREEMCOEF=0.97
# (5)
NUMCHANS=31
USEPOWER=F
NUMCEPS=13
# (6)
ENORMALISE=T
LOFREQ=200
HIFREQ=3500
DELTAWINDOW=2
ACCWINDOW=2
# (7)
SAVECOMPRESSED=F
SAVEWITHCRC=F
HCopy – config file (1)(2)

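NATURALREADORDER=TRUE
NATURALWRITEORDER=TRUE
– Byte-order settings for reading and writing binary data: use the machine's natural byte order. Set both to TRUE.

SOURCEFORMAT=NOHEAD
– The source files are raw data with no header

SOURCEKIND=WAVEFORM
– The source data are waveform samples
TARGETKIND=MFCC_E_D_A
– The output is MFCC plus energy (E), delta (D), and delta-delta (A) coefficients
– If SOURCEFORMAT is not set, HTK format is assumed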
HCopy – config file (3)

SOURCERATE=1250
– The sample period of the source waveform is 0.125 ms
TARGETRATE=100000
– The frame shift is 10 ms
WINDOWSIZE=250000
– The frame (window) length is 25 ms
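HTK expresses these durations as multiples of 100 ns, which makes the raw numbers hard to read at a glance. The following Python sketch converts the three values above into milliseconds and derives the sampling frequency implied by SOURCERATE. The function names are illustrative and are not part of HTK.

```python
# One second contains 10,000,000 HTK time units of 100 ns each.
UNITS_PER_SECOND = 10_000_000
UNITS_PER_MS = 10_000


def htk_to_ms(value):
    """Convert a duration given in 100 ns units to milliseconds."""
    return value / UNITS_PER_MS


def sample_rate_hz(source_rate):
    """Convert a SOURCERATE sample period (100 ns units) to a frequency in Hz."""
    return UNITS_PER_SECOND / source_rate


settings = {"SOURCERATE": 1250, "TARGETRATE": 100000, "WINDOWSIZE": 250000}
for name, value in settings.items():
    print(f"{name}={value} is {htk_to_ms(value)} ms")
print(f"sampling frequency: {sample_rate_hz(settings['SOURCERATE'])} Hz")
```

With these settings, consecutive 25 ms windows overlap by 15 ms, because the window is longer than the 10 ms frame shift.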

In HTK, time values are expressed in units of 100 ns.
HCopy – config file (4)

ZMEANSOURCE=T
– Make the source waveform zero mean, removing the DC offset
USEHAMMING=T
– Use a Hamming Window
PREEMCOEF=0.97
– The pre-emphasis coefficient is 0.97
HCopy – config file (5)

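NUMCHANS=31
– The Mel filterbank has 31 channels
USEPOWER=F
– Use magnitude rather than power in the filterbank analysis; this setting does not control c(0), which is selected with the _0 qualifier in TARGETKIND
NUMCEPS=13
– Use 13 MFCC coefficients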
HCopy – config file (6)

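ENORMALISE=T
– Normalise the energy values
LOFREQ=200
– Lower cutoff frequency of the filterbank, in Hz
HIFREQ=3500
– Upper cutoff frequency of the filterbank, in Hz

DELTAWINDOW=2
ACCWINDOW=2
– Window parameters used to compute the delta and delta-delta coefficients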
HCopy – config file (7)

SAVECOMPRESSED=F
– Whether HTK saves its output in compressed form; set to False
SAVEWITHCRC=F
– Whether HTK appends a CRC checksum to the output file; set to False
HCopy – script file

The script file tells HCopy which files to process.

Each line gives two paths. For example:
/source/dir/001.wav /target/dir/001.mfc
/source/dir/002.wav /target/dir/002.mfc
/source/dir/003.wav /target/dir/003.mfc

On each line, the first path is the source file and the second is the target file.
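For a large corpus, writing this file by hand is tedious. The sketch below generates the source/target pairs from a list of waveform paths. The helper name `make_hcopy_script` and the `.mfc` default extension are illustrative choices that follow the example above; neither is required by HTK.

```python
from pathlib import PurePosixPath


def make_hcopy_script(wav_paths, target_dir, target_ext=".mfc"):
    """Return script-file text with one 'source target' pair per line."""
    lines = []
    for wav in wav_paths:
        src = PurePosixPath(wav)
        # Keep the base name and swap the extension for the feature file.
        dst = PurePosixPath(target_dir) / (src.stem + target_ext)
        lines.append(f"{src} {dst}")
    return "\n".join(lines) + "\n"


wavs = ["/source/dir/001.wav", "/source/dir/002.wav", "/source/dir/003.wav"]
print(make_hcopy_script(wavs, "/target/dir"), end="")
```

The returned text can be written to a file such as XXX.script and passed to HCopy.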
HCopy - Usage

$ HCopy -T 1 -C XXX.config -S XXX.script

– -T 1 sets the Trace Level to 1, so that HCopy reports its progress
– -C specifies the config file. The file name is arbitrary; config and cfg are common extensions
– -S specifies the script file
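When feature extraction is one step of a larger pipeline, it can be convenient to assemble this command in code. The following sketch only builds and prints the argument list; the function name is illustrative, and actually running the command assumes that the HCopy executable (the name is case-sensitive on Unix-like systems) is installed and on the PATH.

```python
import shlex


def build_hcopy_command(config_path, script_path, trace_level=1):
    """Build the HCopy argument list with the three options described above."""
    return [
        "HCopy",
        "-T", str(trace_level),  # trace level
        "-C", config_path,       # config file
        "-S", script_path,       # script file
    ]


cmd = build_hcopy_command("XXX.config", "XXX.script")
print(shlex.join(cmd))
```

To execute it, the list can be handed to `subprocess.run(cmd, check=True)`, which raises an error if HCopy exits with a non-zero status.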
HMM - definition

3-state Left-to-Right HMM

[Figure: left-to-right HMM with five states, numbered 1 to 5]
HMM - definition

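Defining a prototype HMM

~o defines the observation format
<VecSize> 39
– The feature vector size is 39
<MFCC_0_D_A>
– The parameter kind; it must match the setting in the HCopy config file
~h "proto" names the HMM
– A prototype HMM is stored in a file with the same name, proto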
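The vector size declared in the prototype has to equal the number of components the front end actually produces. A minimal sketch of that calculation follows, assuming the usual HTK rule that the _E and _0 qualifiers each add one static coefficient and that _D and _A each append one more copy of the static block. The function name is illustrative, and other qualifiers are deliberately not handled.

```python
def htk_vector_size(num_ceps, target_kind):
    """Estimate the feature vector size for a kind such as MFCC_0_D_A.

    Only the qualifiers E, 0, D and A are handled in this sketch.
    """
    _base, *qualifiers = target_kind.split("_")
    static = num_ceps
    if "E" in qualifiers:
        static += 1  # log energy term
    if "0" in qualifiers:
        static += 1  # zeroth cepstral coefficient c(0)
    # One block of static coefficients, plus one block each for deltas
    # and accelerations when requested.
    blocks = 1 + int("D" in qualifiers) + int("A" in qualifiers)
    return static * blocks


print(htk_vector_size(12, "MFCC_0_D_A"))
print(htk_vector_size(13, "MFCC_E_D_A"))
```

Under this rule, 12 cepstral coefficients plus one energy-like term give 39 components once deltas and accelerations are added, while 13 cepstral coefficients give 42.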
HMM - definition

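The HMM definition is enclosed between <BeginHMM> and <EndHMM>.

<NumStates> 5
– Five states in total: 3 emitting states plus 2 non-emitting states (entry and exit)
Each emitting state must define <Mean> and <Variance>

– <Mean> 39
39 values, each 0.0
– <Variance> 39
39 values, each 1.0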
HMM - definition

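The Transition Matrix is defined with <TransP>

1. The first row (the entry state) is 0.0 1.0 followed by zeros.
2. The last row (the exit state) is all 0.0.
3. Every other row must sum to 1; an entry of 0.0 means that transition is not allowed.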
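Typing 39 zeros and ones for every state by hand is error-prone, so a prototype file is often generated. The sketch below writes a five-state left-to-right prototype in the layout described above. The function names and the 0.6/0.4 self-loop and forward probabilities are illustrative assumptions; any values work as long as each row other than the last sums to 1.

```python
def transition_rows(num_states, self_loop=0.6):
    """Left-to-right transition matrix with non-emitting entry and exit states."""
    rows = []
    for i in range(num_states):
        row = [0.0] * num_states
        if i == 0:
            row[1] = 1.0  # entry state always moves to the first emitting state
        elif i < num_states - 1:
            row[i] = self_loop            # stay in the same state
            row[i + 1] = 1.0 - self_loop  # move to the next state
        # The last row (exit state) stays all zeros.
        rows.append(row)
    return rows


def make_proto(vec_size=39, kind="MFCC_0_D_A", num_states=5, name="proto"):
    """Return the text of a prototype HMM with zero means and unit variances."""
    zeros = " ".join(["0.0"] * vec_size)
    ones = " ".join(["1.0"] * vec_size)
    lines = [f"~o <VecSize> {vec_size} <{kind}>", f'~h "{name}"',
             "<BeginHMM>", f"<NumStates> {num_states}"]
    for state in range(2, num_states):  # emitting states only
        lines += [f"<State> {state}", f"<Mean> {vec_size}", zeros,
                  f"<Variance> {vec_size}", ones]
    lines.append(f"<TransP> {num_states}")
    for row in transition_rows(num_states):
        lines.append(" ".join(f"{p:.1f}" for p in row))
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"


print(make_proto(vec_size=3))  # a short vector keeps the printout readable
```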
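HMM - Training

How the models are trained depends on which kind of annotation file accompanies the training data:
– Label: each unit is listed with its start and end time

0 180000 sil
180000 450000 voc
450000 610000 voc
– Transcription: only the sequence of units is listed, without times

sil
voc
voc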
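The two file types differ only in whether time boundaries are present, which a short parser makes concrete. The sketch below reads label lines of the form shown above, converts the times from 100 ns units to seconds, and then drops the times to obtain the transcription. The function names are illustrative.

```python
UNITS_PER_SECOND = 10_000_000  # HTK times are in units of 100 ns


def parse_label_lines(lines):
    """Parse 'start end name' lines into (start_s, end_s, name) tuples."""
    segments = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        start, end, name = int(fields[0]), int(fields[1]), fields[2]
        segments.append((start / UNITS_PER_SECOND, end / UNITS_PER_SECOND, name))
    return segments


def to_transcription(segments):
    """Discard the time boundaries and keep only the unit sequence."""
    return [name for _start, _end, name in segments]


label = ["0 180000 sil", "180000 450000 voc", "450000 610000 voc"]
segments = parse_label_lines(label)
print(segments)
print(to_transcription(segments))
```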
HMM - Training

If Label files are available
– Train with HInit + HRest
If only Transcription files are available
– Train with HCompV + HERest
What is the difference?
– HInit and HRest require Label files, whereas HCompV and HERest need only Transcription files
HMM - Training

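HCompV
– In HTK this approach is called a flat start
– The mean and variance of all the training data are computed, and every state's Mean and Variance are set to these global values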
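The idea behind a flat start can be shown in a few lines. The sketch below computes the per-dimension mean and variance over all frames and copies them into every emitting state. It is a conceptual illustration and not HCompV's implementation; in particular, dividing by the number of frames (rather than by one less) is an assumption made here.

```python
def global_mean_variance(frames):
    """Per-dimension mean and variance over a list of feature vectors."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var


def flat_start(frames, num_emitting_states=3):
    """Give every emitting state the same global mean and variance."""
    mean, var = global_mean_variance(frames)
    # Each state gets its own copy so later re-estimation can diverge.
    return [{"mean": list(mean), "variance": list(var)}
            for _ in range(num_emitting_states)]


frames = [[1.0, 2.0], [3.0, 6.0]]  # two toy 2-dimensional frames
for state in flat_start(frames):
    print(state)
```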
HMM - Training

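HCompV
– $ HCompV -C config -S script -M dir1 -l aa -o aa -I label.mlf proto
– config should contain the following settings
SOURCEFORMAT=HTK
SOURCEKIND=MFCC_E_D_A
– script lists the feature files used for training
– -l aa -o aa trains on the segments labelled aa and names the output HMM file aa
– -M dir1 specifies the directory in which the output is stored
– -I specifies the master label file; alternatively, -L specifies the directory of label files
– proto is the HMM prototype defined earlier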
HMM - Training

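HERest
– $ HERest -C config -S script -I label.mlf -d dir1 -M dir2 hmmlist
-I specifies the master label file; alternatively, use -L to specify the label directory
-d specifies the directory holding the HMM definitions (the output of HCompV)
-M specifies the directory in which the re-estimated HMMs are stored
hmmlist is a file listing the names of the HMMs to train
– After training, dir2 contains the resulting HMM Macro File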
HMM – mixture incrementing

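HHEd
– Create an edit script, split.hed
– MU 16 {*.state[2-4].mix}
For all HMMs, increase states 2 to 4 to 16 mixtures
– MU 8 {aa.state[2-4].mix}
For the aa HMM, increase states 2 to 4 to 8 mixtures
– $ HHEd -M mix2 -w newHMM -d mix1 split.hed hmmlist
-M mix2 stores the expanded HMMs in the mix2 directory
-w newHMM writes the expanded HMMs to a file named newHMM
-d mix1 looks for the original HMMs in the mix1 directory
hmmlist lists the HMMs to edit; HHEd requires it as the final argument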
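The MU lines in an edit script follow a fixed pattern, so they can be generated rather than typed. The sketch below produces MU commands like those shown above, together with one possible schedule for raising the mixture count in stages. The function names are illustrative, and the doubling schedule is an assumption (a common practice), not something the commands above require.

```python
def mixture_up_command(num_mixtures, hmm="*", first_state=2, last_state=4):
    """Build one MU line for an HHEd edit script."""
    return f"MU {num_mixtures} {{{hmm}.state[{first_state}-{last_state}].mix}}"


def doubling_schedule(target):
    """Mixture counts 2, 4, 8, ... ending exactly at the target."""
    steps = []
    n = 2
    while n < target:
        steps.append(n)
        n *= 2
    steps.append(target)
    return steps


for count in doubling_schedule(16):
    print(mixture_up_command(count))
print(mixture_up_command(8, hmm="aa"))
```

Each printed line could be saved as its own edit script and applied in turn, with re-estimation between the stages.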
HMM – mixture incrementing

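HHEd
– The command above produces the file newHMM. The newly added mixtures have not been trained yet, so the models must be re-estimated with HERest. Increase the number of mixtures step by step, re-estimating after each increase.
– The final set of trained HMMs is assumed here to be named HMM.model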
Recognition – Dictionary & WordNet

The Dictionary tells HTK which HMMs make up each unit that can be recognised.

Example: phone.dic

[Figure: phone.dic entries; for example, when the HMM ai is recognised, the output shown is ai]
SYM_1 SYM_2 … SYM_N SYM_SHOW
Recognition – Dictionary & WordNet

The WordNet describes the recognition grammar, that is, which sequences of units the recogniser may output.
Recognition – Dictionary & WordNet

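WordNet
– A simple WordNet can be written by hand
– A complex WordNet is generated with HParse

phone.syn:
$ci_phone =
a |
ai |
an ;
( SENT-START < $ci_phone [sil] > SENT-END )

$ HParse phone.syn phone.net
Output: phone.net (the WordNet)

[Figure: the resulting network, from START to one of a, ai or an, then an optional sil, with a loop back, to END]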
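With a realistic phone inventory the list of alternatives becomes long, so the grammar file is a natural candidate for generation. The sketch below reproduces the phone.syn layout shown above from a list of phone names. The function name is illustrative, and the phone list is the three-entry example from this section.

```python
def make_phone_loop_grammar(phones, variable="ci_phone"):
    """Return HParse grammar text for a loop over the given phones.

    Each pass through the loop is one phone followed by an optional sil.
    """
    alternatives = " |\n".join(phones)
    return (
        f"${variable} =\n"
        f"{alternatives} ;\n"
        f"( SENT-START < ${variable} [sil] > SENT-END )\n"
    )


print(make_phone_loop_grammar(["a", "ai", "an"]), end="")
```

The text can be saved as phone.syn and compiled with HParse as shown above.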
Recognition - HVite

Once the WordNet and Dictionary are ready, HVite can perform recognition
– $ HVite -C config -w phone.net -H HMM.model phone.dic hmmlist XXX.wav
When it finishes, a file XXX.rec containing the recognition result is produced:
0 700000 sil -428.991882
700000 1100000 b -276.103149
1100000 1700000 t -493.964966
1700000 3000000 weng -1099.999023
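Each line of this output holds a start time, an end time, the recognised label, and a score, with the times again in 100 ns units. The sketch below parses such lines into a more convenient form. The function name and dictionary keys are illustrative, and treating the last field as a log-likelihood score is an assumption about how to read it.

```python
UNITS_PER_SECOND = 10_000_000  # HTK times are in units of 100 ns


def parse_rec_line(line):
    """Split one result line into times (seconds), label and score."""
    start, end, label, score = line.split()
    return {
        "start_s": int(start) / UNITS_PER_SECOND,
        "end_s": int(end) / UNITS_PER_SECOND,
        "label": label,
        "score": float(score),
    }


rec = [
    "0 700000 sil -428.991882",
    "700000 1100000 b -276.103149",
    "1100000 1700000 t -493.964966",
    "1700000 3000000 weng -1099.999023",
]
results = [parse_rec_line(line) for line in rec]
for r in results:
    print(f"{r['start_s']:.2f} to {r['end_s']:.2f} s: {r['label']}")
```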
Recognition - WaveSurfer

   

The recognised units, such as b, t and weng, are Mandarin initials and finals that combine into syllables. Other examples:

j ying
h wang
z ai
h wei
h ou
b yao
sh FNULL1
Application

[Figure: XXX.wav is passed through HCopy & HVite, which use the WordNet, HMM Model, HMM List and Dictionary, and the result is written to XXX.txt]
