
HUE UNIVERSITY, UNIVERSITY OF SCIENCES ----------o0o----------

PATTERN RECOGNITION THEORY

COMBINING MODELS

Hue, 12/2009

TABLE OF CONTENTS

14.1 Bayesian model averaging
14.2 Committees
14.3 Boosting
    14.3.1 Minimizing exponential error
    14.3.2 Error functions for boosting
14.4 Tree-based models
14.5 Conditional mixture models
    14.5.1 Mixtures of linear regression models
    14.5.2 Mixtures of logistic models

COMBINING MODELS
In earlier chapters we studied a range of different models for solving classification and regression problems. Performance can often be improved by combining several models in some way, instead of using a single model in isolation. For example, we might train L different models and then make predictions using the average of the predictions made by each model. Such combinations of models are sometimes called committees. In Section 14.2 we discuss ways of applying the committee concept in practice and give some insight into why the procedure can be effective.

One important variant of the committee method, known as boosting, trains multiple models in sequence, where the error function used to train a particular model depends on the performance of the previous models. This can yield substantial improvements in performance compared with a single model, and is discussed in Section 14.3.

Instead of averaging the predictions of a set of models, an alternative form of combination is to select one of the models to make the prediction, where the choice of model is a function of the input variables. Different models then become responsible for making predictions in different regions of input space. One widely used framework of this kind is the decision tree, in which the selection process can be described as a sequence of binary selections corresponding to the traversal of a tree structure; it is discussed in Section 14.4. In this case the individual models are chosen to be very simple, and the overall flexibility of the model arises from the input-dependent selection process. Decision trees can be applied to both classification and regression problems.

One limitation of decision trees is that the division of input space is based on hard splits, in which only one model is responsible for making predictions for any given value of the input variables. The decision process can be softened by moving to a probabilistic framework for combining models, as discussed in Section 14.5. For example, suppose we have a set of K models for a conditional distribution $p(t \mid \mathbf{x}, k)$, where $\mathbf{x}$ is the input variable, $t$ is the target variable, and $k = 1, \ldots, K$ indexes the model. We can then form a probabilistic mixture of the form

$$p(t \mid \mathbf{x}) = \sum_{k=1}^{K} \pi_k(\mathbf{x})\, p(t \mid \mathbf{x}, k) \tag{14.1}$$

in which $\pi_k(\mathbf{x}) = p(k \mid \mathbf{x})$ are the input-dependent mixing coefficients. Such models can be viewed as mixture distributions in which the component densities, as well as the mixing coefficients, are conditioned on the input variables, and are known as mixtures of experts. They are closely related to the mixture density network model discussed in Section 5.6.

14.1 Bayesian model averaging


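It is important to distinguish between model combination methods and Bayesian model averaging. To understand the difference, consider density estimation using a mixture of Gaussians, in which several Gaussian components are combined probabilistically. The model contains a binary latent variable $\mathbf{z}$ that indicates which component of the mixture is responsible for generating the corresponding data point. The model is therefore specified in terms of a joint distribution

$$p(\mathbf{x}, \mathbf{z}) \tag{14.2}$$

and the corresponding density over the observed variable $\mathbf{x}$ is obtained by marginalizing over the latent variable:

$$p(\mathbf{x}) = \sum_{\mathbf{z}} p(\mathbf{x}, \mathbf{z}) \tag{14.3}$$

For the Gaussian mixture example, this leads to a distribution of the form

$$p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k) \tag{14.4}$$

where $\pi_k$ are the mixing coefficients and $\boldsymbol{\mu}_k$, $\boldsymbol{\Sigma}_k$ are the component means and covariances. This is an example of model combination. For independent, identically distributed data, we can use (14.3) to write the marginal probability of a data set $\mathbf{X} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ in the form

$$p(\mathbf{X}) = \prod_{n=1}^{N} p(\mathbf{x}_n) = \prod_{n=1}^{N} \left[ \sum_{\mathbf{z}_n} p(\mathbf{x}_n, \mathbf{z}_n) \right] \tag{14.5}$$

From this we see that each observed data point $\mathbf{x}_n$ has a corresponding latent variable $\mathbf{z}_n$. Now suppose we have several different models indexed by $h = 1, \ldots, H$ with prior probabilities $p(h)$. For instance, one model might be a mixture of Gaussians and another a mixture of Cauchy distributions. The marginal distribution over the data set is given by

$$p(\mathbf{X}) = \sum_{h=1}^{H} p(\mathbf{X} \mid h)\, p(h) \tag{14.6}$$

This is an example of Bayesian model averaging. The interpretation of this sum over $h$ is that just one model is responsible for generating the whole data set, and the probability distribution over $h$ simply reflects our uncertainty as to which model that is. As the size of the data set grows, this uncertainty decreases and the posterior probabilities $p(h \mid \mathbf{X})$ become increasingly focused on just one of the models. This highlights the key difference between Bayesian model averaging and model combination: in Bayesian model averaging the whole data set is generated by a single model. By contrast, when we combine multiple models, as in (14.5), different data points in the data set can be generated from different values of the latent variable $\mathbf{z}$ and hence by different components. Although we have considered the marginal probability $p(\mathbf{X})$, the same considerations apply to the predictive density $p(\mathbf{x} \mid \mathbf{X})$ or to conditional distributions such as $p(t \mid \mathbf{x}, \mathbf{X}, \mathbf{T})$.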
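The concentration of the posterior $p(h \mid \mathbf{X})$ on a single model can be checked numerically. The following is a minimal sketch under simplifying assumptions that are not in the text: there are only two candidate models, each a unit-variance Gaussian with a fixed mean and no free parameters, so that $p(\mathbf{X} \mid h)$ is just the likelihood. The names `log_gauss` and `model_posterior` are hypothetical.

```python
import numpy as np

def log_gauss(x, mean, var=1.0):
    """Log density of a univariate Gaussian, evaluated elementwise."""
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mean) ** 2 / var

def model_posterior(X, means, prior):
    """Posterior p(h|X) over candidate models h (assumed: fixed unit-variance Gaussians)."""
    # log p(X|h) = sum_n log p(x_n|h): the whole data set is explained by ONE model h
    log_evidence = np.array([log_gauss(X, m).sum() for m in means])
    log_post = log_evidence + np.log(prior)
    log_post -= log_post.max()            # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

rng = np.random.default_rng(0)
data = rng.normal(loc=0.0, scale=1.0, size=200)   # data generated by the first model
means = [0.0, 1.0]                                 # two hypothetical candidate models
prior = np.array([0.5, 0.5])

post_small = model_posterior(data[:5], means, prior)   # few points: posterior may still be spread out
post_large = model_posterior(data, means, prior)       # many points: posterior focuses on one model
```

With only a handful of points the posterior can remain spread over both models, whereas with the full data set it should place almost all of its mass on the generating model.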

14.2 Committees
The simplest way to construct a committee is to average the predictions of a set of individual models. Such a procedure can be motivated from a frequentist perspective by considering the trade-off between bias and variance, which decomposes the error of a model into a bias component, arising from differences between the model and the true function to be predicted, and a variance component, representing the sensitivity of the model to the individual data points. Recall from Figure 3.5 that when we trained multiple polynomials on the sinusoidal data and then averaged the resulting functions, the contribution arising from the variance term tended to cancel, leading to improved predictions. When we averaged a set of low-bias models (corresponding to higher-order polynomials), we obtained accurate predictions for the underlying sinusoidal function from which the data were generated.

In practice, of course, we have only a single data set, so we have to find a way to introduce variability between the different models in the committee. One approach is to use bootstrap data sets, discussed in Section 1.2.3. Consider a regression problem in which we are trying to predict the value of a single continuous variable, and suppose we generate M bootstrap data sets and use each one to train a separate copy $y_m(\mathbf{x})$ of a predictive model, where $m = 1, \ldots, M$. The committee prediction is given by

$$y_{\mathrm{COM}}(\mathbf{x}) = \frac{1}{M} \sum_{m=1}^{M} y_m(\mathbf{x}) \tag{14.7}$$

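This procedure is known as bootstrap aggregation, or bagging (Breiman, 1996). Suppose the true regression function that we are trying to predict is $h(\mathbf{x})$, so that the output of each model can be written as the true value plus an error:

$$y_m(\mathbf{x}) = h(\mathbf{x}) + \epsilon_m(\mathbf{x}) \tag{14.8}$$

The average sum-of-squares error then takes the form

$$\mathbb{E}_{\mathbf{x}}\left[\{y_m(\mathbf{x}) - h(\mathbf{x})\}^2\right] = \mathbb{E}_{\mathbf{x}}\left[\epsilon_m(\mathbf{x})^2\right] \tag{14.9}$$

where $\mathbb{E}_{\mathbf{x}}[\cdot]$ denotes a frequentist expectation with respect to the distribution of the input vector $\mathbf{x}$. The average error made by the models acting individually is therefore

$$E_{\mathrm{AV}} = \frac{1}{M} \sum_{m=1}^{M} \mathbb{E}_{\mathbf{x}}\left[\epsilon_m(\mathbf{x})^2\right] \tag{14.10}$$

Similarly, the expected error from the committee (14.7) is given by

$$E_{\mathrm{COM}} = \mathbb{E}_{\mathbf{x}}\left[\left\{\frac{1}{M} \sum_{m=1}^{M} y_m(\mathbf{x}) - h(\mathbf{x})\right\}^2\right] = \mathbb{E}_{\mathbf{x}}\left[\left\{\frac{1}{M} \sum_{m=1}^{M} \epsilon_m(\mathbf{x})\right\}^2\right] \tag{14.11}$$

If we assume that the errors have zero mean and are uncorrelated, so that

$$\mathbb{E}_{\mathbf{x}}\left[\epsilon_m(\mathbf{x})\right] = 0 \tag{14.12}$$

$$\mathbb{E}_{\mathbf{x}}\left[\epsilon_m(\mathbf{x})\, \epsilon_l(\mathbf{x})\right] = 0, \qquad m \neq l \tag{14.13}$$

then we obtain

$$E_{\mathrm{COM}} = \frac{1}{M} E_{\mathrm{AV}} \tag{14.14}$$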

This result suggests that the average error of a model can be reduced by a factor of M simply by averaging M versions of the model. However, it depends on the key assumption that the errors due to the individual models are uncorrelated. In practice the errors are typically highly correlated, and the reduction in overall error is generally small. It can nevertheless be shown that the expected committee error will not exceed the expected error of the constituent models, so that $E_{\mathrm{COM}} \le E_{\mathrm{AV}}$. To achieve more significant improvements we turn to a more sophisticated technique for building committees, known as boosting.
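The bagging procedure and the inequality $E_{\mathrm{COM}} \le E_{\mathrm{AV}}$ can be illustrated with a short experiment. This is only a sketch under assumptions that are not in the text: the true function is taken to be a sinusoid, the base model is a degree-5 polynomial fitted with `numpy.polyfit`, and the expectation $\mathbb{E}_{\mathbf{x}}$ is approximated by an average over a grid of test inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def h(x):
    """True regression function (assumed sinusoidal for illustration)."""
    return np.sin(2 * np.pi * x)

# A single noisy training set, as in practice
N, M, degree = 30, 20, 5
x_train = rng.uniform(0.0, 1.0, N)
t_train = h(x_train) + rng.normal(0.0, 0.3, N)
x_test = np.linspace(0.05, 0.95, 200)

# Train M polynomial models, each on a bootstrap resample of the training set
preds = np.empty((M, x_test.size))
for m in range(M):
    idx = rng.integers(0, N, N)                       # draw N indices with replacement
    coeffs = np.polyfit(x_train[idx], t_train[idx], degree)
    preds[m] = np.polyval(coeffs, x_test)

errors = preds - h(x_test)                            # eps_m(x) from (14.8)
e_av = np.mean(errors ** 2)                           # (14.10), averaged over models and inputs
e_com = np.mean(errors.mean(axis=0) ** 2)             # (14.11), error of the averaged prediction
```

Because the bootstrap models are trained on overlapping data, their errors are correlated, so `e_com` is typically well above `e_av / M` while still not exceeding `e_av`.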

14.3 Boosting
Boosting is a powerful technique for combining multiple base classifiers to produce a form of committee whose performance can be significantly better than that of any of the base classifiers. Here we describe the most widely used form of boosting algorithm, AdaBoost, short for "adaptive boosting", developed by Freund and Schapire (1996). Boosting can give good results even if the base classifiers perform only slightly better than random, and hence the base classifiers are sometimes known as weak learners. Although originally designed for classification problems, boosting can also be extended to regression (Friedman, 2001).

The main difference between boosting and committee methods such as bagging (discussed above) is that the base classifiers are trained in sequence, and each base classifier is trained using a weighted form of the data set in which the weighting coefficient associated with each data point depends on the performance of the previous classifiers. In particular, points that are misclassified by one of the base classifiers are given greater weight when used to train the next classifier in the sequence. Once all the classifiers have been trained, their predictions are combined through a weighted majority voting scheme, as illustrated in Figure 14.1.

Consider a two-class classification problem in which the training data comprise input vectors $\mathbf{x}_1, \ldots, \mathbf{x}_N$ along with corresponding binary target variables $t_1, \ldots, t_N$, where $t_n \in \{-1, 1\}$. Each data point is given an associated weighting parameter $w_n$, initially set to $1/N$ for all data points. We suppose that we have a procedure for training a base classifier using weighted data to give a function $y(\mathbf{x}) \in \{-1, 1\}$. At each stage of the algorithm, AdaBoost trains a new classifier using a data set in which the weighting coefficients are adjusted according to the performance of the previously trained classifier, so as to give greater weight to the misclassified data points. Finally, when the desired number of base classifiers have been trained, they are combined to form a committee using coefficients that give different weight to different base classifiers. The AdaBoost algorithm is as follows:

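Figure 14.1 Schematic illustration of the boosting framework. Each base classifier $y_m(\mathbf{x})$ is trained on a weighted form of the training set (blue) in which the weights $w_n^{(m)}$ depend on the performance of the previous base classifier $y_{m-1}(\mathbf{x})$ (green). Once all base classifiers have been trained, they are combined to give the final classifier $Y_M(\mathbf{x})$ (red).

AdaBoost

1. Initialize the data weighting coefficients $\{w_n\}$ by setting $w_n^{(1)} = 1/N$ for $n = 1, \ldots, N$.
2. For $m = 1, \ldots, M$:
   a. Fit a classifier $y_m(\mathbf{x})$ to the training data by minimizing the weighted error function

      $$J_m = \sum_{n=1}^{N} w_n^{(m)} I(y_m(\mathbf{x}_n) \neq t_n) \tag{14.15}$$

      where $I(y_m(\mathbf{x}_n) \neq t_n)$ is the indicator function, equal to 1 when $y_m(\mathbf{x}_n) \neq t_n$ and 0 otherwise.
   b. Evaluate the quantities

      $$\epsilon_m = \frac{\sum_{n=1}^{N} w_n^{(m)} I(y_m(\mathbf{x}_n) \neq t_n)}{\sum_{n=1}^{N} w_n^{(m)}} \tag{14.16}$$

      and then use these to evaluate

      $$\alpha_m = \ln\left\{\frac{1 - \epsilon_m}{\epsilon_m}\right\} \tag{14.17}$$

   c. Update the data weighting coefficients

      $$w_n^{(m+1)} = w_n^{(m)} \exp\{\alpha_m I(y_m(\mathbf{x}_n) \neq t_n)\} \tag{14.18}$$

3. Make predictions using the final model, which is given by

   $$Y_M(\mathbf{x}) = \mathrm{sign}\left(\sum_{m=1}^{M} \alpha_m y_m(\mathbf{x})\right) \tag{14.19}$$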

We see that the first base classifier $y_1(\mathbf{x})$ is trained using weighting coefficients $w_n^{(1)}$ that are all equal, which corresponds to the usual procedure for training a single classifier. From (14.18), we see that in subsequent iterations the weighting coefficients $w_n^{(m)}$ are increased for data points that are misclassified, so that the relative weight of correctly classified points decreases. Successive classifiers are therefore forced to place greater emphasis on points that have been misclassified by previous classifiers, and data points that continue to be misclassified by successive classifiers receive ever greater weight. The quantities $\epsilon_m$ represent weighted measures of the error rates of each of the base classifiers on the data set. We therefore see that the weighting coefficients $\alpha_m$ defined by (14.17) give greater weight to the more accurate classifiers when computing the overall output given by (14.19).

The AdaBoost algorithm is illustrated in Figure 14.2, using a subset of 30 data points taken from the toy classification data set shown in Figure A.7. Here each base learner consists of a threshold on one of the input variables. This simple classifier corresponds to a form of decision tree known as a "decision stump", that is, a decision tree with a single node. Each base learner therefore classifies an input according to whether one of the input features exceeds some threshold, and so simply partitions the space into two regions separated by a linear decision surface that is parallel to one of the axes.
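The algorithm with decision stumps as base learners can be sketched in a few lines of NumPy. This is an illustrative sketch rather than a reference implementation: the function names (`fit_stump`, `stump_predict`, `adaboost`, `predict`), the exhaustive threshold search, and the clipping of $\epsilon_m$ to avoid taking the logarithm of zero are all assumptions added here.

```python
import numpy as np

def fit_stump(X, t, w):
    """Return the (feature, threshold, sign) stump minimising the weighted error (14.15)."""
    best = (np.inf, 0, 0.0, 1)
    for d in range(X.shape[1]):
        for thr in np.unique(X[:, d]):
            for s in (1, -1):
                pred = np.where(X[:, d] > thr, s, -s)
                J = np.sum(w * (pred != t))
                if J < best[0]:
                    best = (J, d, thr, s)
    return best[1:]

def stump_predict(stump, X):
    """Predict +s where the chosen feature exceeds the threshold, and -s elsewhere."""
    d, thr, s = stump
    return np.where(X[:, d] > thr, s, -s)

def adaboost(X, t, M):
    N = X.shape[0]
    w = np.full(N, 1.0 / N)                        # step 1: uniform initial weights
    stumps, alphas = [], []
    for _ in range(M):
        stump = fit_stump(X, t, w)                 # step 2(a)
        miss = stump_predict(stump, X) != t
        eps = np.sum(w * miss) / np.sum(w)         # (14.16)
        eps = np.clip(eps, 1e-10, 1 - 1e-10)       # numerical guard, not part of the algorithm
        alpha = np.log((1 - eps) / eps)            # (14.17)
        w = w * np.exp(alpha * miss)               # (14.18)
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)

def predict(stumps, alphas, X):
    """Weighted vote of the base classifiers, as in (14.19)."""
    votes = sum(a * stump_predict(s, X) for s, a in zip(stumps, alphas))
    return np.sign(votes)
```

A one-dimensional "interval" problem (class +1 in the middle of the range, class -1 at both ends) is a convenient check: no single stump can separate it, but a weighted vote of three stumps can.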

14.3.1 Minimizing exponential error


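Boosting was originally motivated using statistical learning theory, leading to upper bounds on the generalization error. However, these bounds turn out to be too loose to have practical value, and the actual performance of boosting is much better than the bounds alone would suggest. Friedman et al. (2000) gave a different and very simple interpretation of boosting in terms of the sequential minimization of an exponential error function.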

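Consider the exponential error function defined by

$$E = \sum_{n=1}^{N} \exp\{-t_n f_m(\mathbf{x}_n)\} \tag{14.20}$$

where $f_m(\mathbf{x})$ is a classifier defined in terms of a linear combination of base classifiers $y_l(\mathbf{x})$ of the form

$$f_m(\mathbf{x}) = \frac{1}{2} \sum_{l=1}^{m} \alpha_l y_l(\mathbf{x}) \tag{14.21}$$

and $t_n \in \{-1, 1\}$ are the training set target values. Our goal is to minimize E with respect to both the weighting coefficients $\alpha_l$ and the parameters of the base classifiers $y_l(\mathbf{x})$.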

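Figure 14.2: Illustration of boosting in which the base learners consist of simple thresholds applied to one or other of the axes. Each panel shows the number m of base learners trained so far, along with the decision boundary of the most recent base learner (dashed black line) and the combined decision boundary of the ensemble (solid green line). Each data point is depicted by a circle whose radius indicates the weight assigned to that data point when training the most recently added base learner. For instance, points that are misclassified by the m = 1 base learner are given greater weight when training the m = 2 base learner.

Instead of performing a global minimization of the error function, however, we suppose that the base classifiers $y_1(\mathbf{x}), \ldots, y_{m-1}(\mathbf{x})$ are fixed, as are their coefficients $\alpha_1, \ldots, \alpha_{m-1}$, and so we minimize only with respect to $\alpha_m$ and $y_m(\mathbf{x})$. Separating off the contribution from the base classifier $y_m(\mathbf{x})$, we can write the error function in the form

$$E = \sum_{n=1}^{N} \exp\left\{-t_n f_{m-1}(\mathbf{x}_n) - \frac{1}{2} t_n \alpha_m y_m(\mathbf{x}_n)\right\} = \sum_{n=1}^{N} w_n^{(m)} \exp\left\{-\frac{1}{2} t_n \alpha_m y_m(\mathbf{x}_n)\right\} \tag{14.22}$$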

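where the coefficients $w_n^{(m)} = \exp\{-t_n f_{m-1}(\mathbf{x}_n)\}$ can be viewed as constants because we are optimizing only $\alpha_m$ and $y_m(\mathbf{x})$. If we denote by $\mathcal{T}_m$ the set of data points that are correctly classified by $y_m(\mathbf{x})$, and by $\mathcal{M}_m$ the remaining misclassified points, then we can rewrite the error function in the form

$$E = e^{-\alpha_m/2} \sum_{n \in \mathcal{T}_m} w_n^{(m)} + e^{\alpha_m/2} \sum_{n \in \mathcal{M}_m} w_n^{(m)} = \left(e^{\alpha_m/2} - e^{-\alpha_m/2}\right) \sum_{n=1}^{N} w_n^{(m)} I(y_m(\mathbf{x}_n) \neq t_n) + e^{-\alpha_m/2} \sum_{n=1}^{N} w_n^{(m)} \tag{14.23}$$

When we minimize this with respect to $y_m(\mathbf{x})$, we see that the second term is constant, so this is equivalent to minimizing (14.15), because the overall multiplicative factor in front of the summation does not affect the location of the minimum. Similarly, minimizing with respect to $\alpha_m$, we obtain (14.17), in which $\epsilon_m$ is defined by (14.16). From (14.22) we see that, having found $\alpha_m$ and $y_m(\mathbf{x})$, the weights on the data points are updated using

$$w_n^{(m+1)} = w_n^{(m)} \exp\left\{-\frac{1}{2} t_n \alpha_m y_m(\mathbf{x}_n)\right\} \tag{14.24}$$

Making use of the fact that

$$t_n y_m(\mathbf{x}_n) = 1 - 2 I(y_m(\mathbf{x}_n) \neq t_n) \tag{14.25}$$

we see that the weights $w_n^{(m)}$ are updated at the next iteration using

$$w_n^{(m+1)} = w_n^{(m)} \exp(-\alpha_m/2) \exp\{\alpha_m I(y_m(\mathbf{x}_n) \neq t_n)\} \tag{14.26}$$

Because the term $\exp(-\alpha_m/2)$ is independent of n, it weights all data points by the same factor and so can be discarded. We thus obtain (14.18). Finally, once all the base classifiers are trained, new data points are classified by evaluating the sign of the combined function defined by (14.21). Because the factor of 1/2 does not affect the sign it can be omitted, giving (14.19).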
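The claim that minimizing the exponential error with respect to $\alpha_m$ yields $\alpha_m = \ln\{(1-\epsilon_m)/\epsilon_m\}$ can be verified numerically. The sketch below evaluates the error, written as a function of $\alpha_m$ for fixed weights and a fixed base classifier, on a fine grid and compares the grid minimizer with the closed form. The weights and misclassification pattern are made-up illustrative values.

```python
import numpy as np

def exp_error(alpha, w, miss):
    """Exponential error as a function of alpha, for fixed weights w and indicator miss."""
    return ((np.exp(alpha / 2) - np.exp(-alpha / 2)) * np.sum(w * miss)
            + np.exp(-alpha / 2) * np.sum(w))

w = np.array([0.1, 0.2, 0.3, 0.4])            # hypothetical weights w_n^(m)
miss = np.array([1, 0, 0, 0])                 # indicator I(y_m(x_n) != t_n)

eps = np.sum(w * miss) / np.sum(w)            # weighted error rate
alpha_closed = np.log((1 - eps) / eps)        # closed-form minimiser

grid = np.linspace(0.01, 6.0, 60000)          # brute-force search over alpha
alpha_grid = grid[np.argmin(exp_error(grid, w, miss))]
```

Writing $A = \sum_n w_n^{(m)} I(\cdot)$ and $S = \sum_n w_n^{(m)}$, the error is $A e^{\alpha/2} + (S - A) e^{-\alpha/2}$, whose derivative vanishes when $e^{\alpha} = (S - A)/A = (1 - \epsilon_m)/\epsilon_m$, which is what the grid search reproduces.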

14.3.2 Error functions for boosting


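The exponential error function minimized by the AdaBoost algorithm differs from those considered in previous chapters. To gain some insight into its nature, we first consider the expected error given by

$$\mathbb{E}_{\mathbf{x},t}\left[\exp\{-t y(\mathbf{x})\}\right] = \sum_{t} \int \exp\{-t y(\mathbf{x})\}\, p(t \mid \mathbf{x})\, p(\mathbf{x})\, \mathrm{d}\mathbf{x} \tag{14.27}$$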

If we perform a variational minimization with respect to all possible functions $y(\mathbf{x})$, we obtain

$$y(\mathbf{x}) = \frac{1}{2} \ln\left\{\frac{p(t = 1 \mid \mathbf{x})}{p(t = -1 \mid \mathbf{x})}\right\} \tag{14.28}$$

which is half the log-odds. The AdaBoost algorithm is therefore seeking the best approximation to the log-odds ratio within the space of functions represented by the linear combination of base classifiers, subject to the constrained minimization resulting from the sequential optimization strategy. This result motivates the use of the sign function in (14.19) to arrive at the final classification decision.

Figure 14.3 Plot of the exponential (green) and rescaled cross-entropy (red) error functions, along with the hinge error used in support vector machines and the misclassification error (black). Note that for large negative values of $z = t y(\mathbf{x})$, the cross-entropy gives a linearly increasing penalty, whereas the exponential loss gives an exponentially increasing penalty.

We have already seen that the minimizer $y(\mathbf{x})$ of the cross-entropy error (4.90) for two-class classification is given by the posterior class probability. In the case of a target variable $t \in \{-1, 1\}$, the error function is given by $\ln(1 + \exp(-yt))$. This is compared with the exponential error function in Figure 14.3, where we have divided the cross-entropy error by a constant factor $\ln(2)$ so that it passes through the point (0, 1) for ease of comparison. We see that both can be viewed as continuous approximations to the ideal misclassification error function. An advantage of the exponential error is that its sequential minimization leads to the simple AdaBoost scheme. One drawback, however, is that it penalizes large negative values of $t y(\mathbf{x})$ much more strongly than cross-entropy. In particular, for large negative values of $ty$, the cross-entropy grows linearly with $|ty|$, whereas the exponential error function grows exponentially with $|ty|$. The exponential error function will therefore be much less robust to outliers or misclassified data points. Another important difference between cross-entropy and the exponential error function is that the latter cannot be interpreted as the log likelihood function of any well-defined probabilistic model. Furthermore, the exponential error does not generalize to classification problems with K > 2 classes, again in contrast to the cross-entropy for a probabilistic model, which is easily generalized to give (4.108).

The interpretation of boosting as the sequential optimization of an additive model under an exponential error (Friedman et al., 2000) opens the door to a wide range of boosting-like algorithms, including multiclass extensions, by altering the choice of error function. It also motivates the extension to regression problems (Friedman, 2001). If we consider a sum-of-squares error function for regression, then sequential minimization of an additive model of the form (14.21) simply involves fitting each new base classifier to the residual errors $t_n - f_{m-1}(\mathbf{x}_n)$ from the previous model. As we have noted, however, the sum-of-squares error is not robust to outliers, and this can be addressed by basing the boosting algorithm on the absolute deviation $|y - t|$ instead. These two error functions are compared in Figure 14.4.

Figure 14.4 Comparison of the squared error (green) with the absolute error (red), showing how the latter places much less emphasis on large errors and hence is more robust to outliers and mislabelled data points.
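The error functions compared in this section are easy to evaluate directly as functions of the margin $z = t y(\mathbf{x})$. The following sketch defines them with hypothetical function names; the cross-entropy is divided by $\ln 2$ so that it passes through (0, 1), as described above, and the value assigned to the misclassification error exactly at z = 0 is an arbitrary choice made here.

```python
import numpy as np

def exponential_loss(z):
    """Exponential error exp(-z) minimised by AdaBoost."""
    return np.exp(-z)

def scaled_cross_entropy(z):
    """ln(1 + exp(-z)) divided by ln 2, so that the curve passes through (0, 1)."""
    return np.log1p(np.exp(-z)) / np.log(2.0)

def hinge_loss(z):
    """Hinge error used in support vector machines."""
    return np.maximum(0.0, 1.0 - z)

def misclassification(z):
    """Ideal 0/1 error: 1 for negative margin, 0 otherwise (value at z = 0 chosen as 0)."""
    return (z < 0).astype(float)

z = np.array([-4.0, -2.0, 0.0, 2.0])
losses = {
    "exponential": exponential_loss(z),
    "cross-entropy": scaled_cross_entropy(z),
    "hinge": hinge_loss(z),
    "misclassification": misclassification(z),
}
```

At z = -4 the exponential error is already about ten times larger than the rescaled cross-entropy, which illustrates why it is less robust to badly misclassified points.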

14.4 Tree-based models
There are various simple but widely used models that work by partitioning the input space into cuboid regions, whose edges are aligned with the axes, and then assigning a simple model (for example, a constant) to each region. They can be viewed as a model combination method in which only one model is responsible for making predictions at any given point in input space. Given a new input $\mathbf{x}$, the process of selecting a specific model can be described by a sequential decision-making process corresponding to the traversal of a binary tree (one that splits into two branches at each node). Here we focus on a particular tree-based framework called classification and regression trees, or CART, although there are many other variants going by such names as ID3 and C4.5 (Quinlan, 1986; Quinlan, 1993).

Figure 14.5 Illustration of a two-dimensional input space that has been partitioned into five regions using axis-aligned boundaries.

Figure 14.5 shows a recursive binary partitioning of the input space, along with the corresponding tree structure. In this example, the first step divides the whole of the input space into two regions according to whether $x_1 \le \theta_1$ or $x_1 > \theta_1$, where $\theta_1$ is a parameter of the model. This creates two subregions, each of which can then be subdivided independently. For instance, the region $x_1 \le \theta_1$ is further subdivided according to whether $x_2 \le \theta_2$ or $x_2 > \theta_2$, giving rise to the regions denoted A and B. The recursive subdivision can be described by the traversal of the binary tree shown in Figure 14.6. For any new input $\mathbf{x}$, we determine which region it falls into by starting at the top of the tree at the root node and following a path down to a specific leaf node according to the decision criteria at each node. Note that such decision trees are not probabilistic graphical models.

Figure 14.6 Binary tree corresponding to the partitioning of input space shown in Figure 14.5.

Within each region there is a separate model to predict the target variable. For instance, in regression we might simply predict a constant over each region, or in classification we might assign each region to a specific class. A key property of tree-based models, which makes them popular in fields such as medical diagnosis, is that they are readily interpretable by humans because they correspond to a sequence of binary decisions applied to the individual input variables. For instance, to predict a patient's disease, we might first ask "is their temperature greater than some threshold?". If the answer is yes, then we might next ask "is their blood pressure less than some threshold?". Each leaf of the tree is then associated with a specific diagnosis.

In order to learn such a model from a training set, we have to determine the structure of the tree, including which input variable is chosen at each node to form the split criterion as well as the value of the threshold parameter for the split. We also have to determine the values of the predictive variable within each region.

Consider first a regression problem in which the goal is to predict a single target variable t from a D-dimensional vector $\mathbf{x} = (x_1, \ldots, x_D)^{\mathrm{T}}$ of input variables. The training data consist of input vectors $\{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ along with the corresponding continuous labels $\{t_1, \ldots, t_N\}$. If the partitioning of the input space is given, and we minimize the sum-of-squares error function, then the optimal value of the predictive variable within any given region is just the average of the values of $t_n$ for those data points that fall in that region.

Now consider how to determine the structure of the decision tree. Even for a fixed number of nodes in the tree, the problem of determining the optimal structure (including the choice of input variable for each split as well as the corresponding thresholds) to minimize the sum-of-squares error is usually computationally infeasible because of the combinatorially large number of possible solutions. Instead, a greedy optimization is generally done by starting with a single root node, corresponding to the whole input space, and then growing the tree by adding nodes one at a time. At each step there will be some number of candidate regions in input space that can be split, corresponding to the addition of a pair of leaf nodes to the existing tree. For each of these, there is a choice of which of the D input variables to split, as well as the value of the threshold. The joint optimization of the choice of region to split, and the choice of input variable and threshold, can be done efficiently by exhaustive search, noting that, for a given choice of split variable and threshold, the optimal choice of predictive variable is given by the local average of the data, as noted earlier. This is repeated for all possible choices of variable to be split, and the one that gives the smallest residual sum-of-squares error is retained.

Given a greedy strategy for growing the tree, there remains the issue of when to stop adding nodes. A simple approach would be to stop when the reduction in residual error falls below some threshold. However, it is found empirically that often none of the available splits produces a significant reduction in error, and yet after several more splits a substantial error reduction is found. For this reason, it is common practice to grow a large tree, using a stopping criterion based on the number of data points associated with the leaf nodes, and then prune back the resulting tree. The pruning is based on a criterion that balances residual error against a measure of model complexity. If we denote the starting tree for pruning by $T_0$, then we define $T \subset T_0$ to be a subtree of $T_0$ if it can be obtained by pruning nodes from $T_0$ (in other words, by collapsing internal nodes by combining the corresponding regions). Suppose the leaf nodes are indexed by $\tau = 1, \ldots, |T|$, with leaf node $\tau$ representing a region $R_\tau$ of input space having $N_\tau$ data points, and $|T|$ denoting the total number of leaf nodes. The optimal prediction for region $R_\tau$ is then given by:
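$$y_\tau = \frac{1}{N_\tau} \sum_{\mathbf{x}_n \in R_\tau} t_n \tag{14.29}$$

and the corresponding contribution to the residual sum-of-squares is:

$$Q_\tau(T) = \sum_{\mathbf{x}_n \in R_\tau} \{t_n - y_\tau\}^2 \tag{14.30}$$

The pruning criterion is then given by:

$$C(T) = \sum_{\tau=1}^{|T|} Q_\tau(T) + \lambda |T| \tag{14.31}$$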

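The regularization parameter $\lambda$ determines the trade-off between the overall residual sum-of-squares error and the complexity of the model as measured by the number $|T|$ of leaf nodes, and its value is chosen by cross-validation.

For classification problems, the process of growing and pruning the tree is similar, except that the sum-of-squares error is replaced by a more appropriate measure of performance. If we define $p_{\tau k}$ to be the proportion of data points in region $R_\tau$ assigned to class k, where $k = 1, \ldots, K$, then two commonly used choices are the cross-entropy

$$Q_\tau(T) = -\sum_{k=1}^{K} p_{\tau k} \ln p_{\tau k} \tag{14.32}$$

and the Gini index

$$Q_\tau(T) = \sum_{k=1}^{K} p_{\tau k} (1 - p_{\tau k}) \tag{14.33}$$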
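Both impurity measures can be computed directly from the vector of class proportions in a region. The sketch below uses hypothetical function names and adopts the usual convention that $0 \ln 0 = 0$; the cross-entropy is taken with a leading minus sign, $-\sum_k p_{\tau k} \ln p_{\tau k}$, so that it is non-negative.

```python
import numpy as np

def cross_entropy(p):
    """Cross-entropy impurity -sum_k p_k ln p_k of a vector of class proportions."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]                       # treat 0 * ln 0 as 0
    return float(-np.sum(nz * np.log(nz)))

def gini(p):
    """Gini index sum_k p_k (1 - p_k) of a vector of class proportions."""
    p = np.asarray(p, dtype=float)
    return float(np.sum(p * (1.0 - p)))

pure = [1.0, 0.0]        # all points in the region belong to one class
mixed = [0.5, 0.5]       # two classes equally represented
```

For a pure region both measures are zero, while for the evenly mixed two-class region the Gini index is 0.5 and the cross-entropy is ln 2.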

Both of these vanish for $p_{\tau k} = 0$ and $p_{\tau k} = 1$ and have a maximum at $p_{\tau k} = 0.5$. They encourage the formation of regions in which a high proportion of the data points are assigned to one class. The cross-entropy and the Gini index are better measures than the misclassification rate for growing the tree because they are more sensitive to the node probabilities. Also, unlike the misclassification rate, they are differentiable and hence better suited to gradient-based optimization methods. For subsequent pruning of the tree, the misclassification rate is generally used.

The interpretability of a tree model such as CART is often seen as its main strength. In practice, however, the particular tree structure that is learned is very sensitive to the details of the data set, so that a small change to the training data can result in a very different set of splits.

There are other problems with tree-based methods of the kind considered in this section. One is that the splits are aligned with the axes of the feature space, which may be very suboptimal. For instance, to separate two classes whose optimal decision boundary runs at 45 degrees to the axes would need a large number of axis-parallel splits of the input space, compared with a single non-axis-aligned split. Furthermore, the splits in a decision tree are hard, so that each region of input space is associated with one, and only one, leaf node model.
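The greedy construction of a regression tree described in this section can be sketched as follows. This is a simplified illustration, not CART itself: candidate thresholds are taken to be midpoints between consecutive feature values, growth stops on a minimum number of points or a maximum depth (both hypothetical parameters), and the pruning step based on $C(T)$ is omitted. All function names are assumptions.

```python
import numpy as np

def best_split(X, t):
    """Exhaustive search for the feature/threshold pair with the smallest residual sum of squares."""
    best = None                                            # (sse, feature, threshold)
    for d in range(X.shape[1]):
        values = np.unique(X[:, d])
        for thr in (values[:-1] + values[1:]) / 2:         # midpoints between consecutive values
            left, right = t[X[:, d] <= thr], t[X[:, d] > thr]
            # the optimal prediction in each region is the local mean of the targets
            sse = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
            if best is None or sse < best[0]:
                best = (sse, d, thr)
    return best

def grow(X, t, min_points=2, depth=0, max_depth=3):
    """Grow a tree greedily; each leaf stores the mean target value of its region."""
    can_split = len(t) >= 2 * min_points and depth < max_depth
    split = best_split(X, t) if can_split else None
    if split is None:
        return {"leaf": True, "value": float(t.mean())}
    _, d, thr = split
    mask = X[:, d] <= thr
    return {"leaf": False, "feature": d, "threshold": float(thr),
            "left": grow(X[mask], t[mask], min_points, depth + 1, max_depth),
            "right": grow(X[~mask], t[~mask], min_points, depth + 1, max_depth)}

def predict_one(tree, x):
    """Traverse the tree from the root to a leaf and return the leaf's prediction."""
    while not tree["leaf"]:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["value"]
```

On a tiny data set such as inputs 0, 1, 2, 3 with targets 0, 0, 1, 1, the search selects the threshold 1.5, which gives zero residual error, and prediction amounts to following the binary decisions down to a leaf.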

14.5 Conditional mixture models


We have seen that standard decision trees are restricted by hard, axis-aligned splits of the input space. These constraints can be relaxed, at some cost in interpretability, by allowing soft, probabilistic splits that can be functions of all of the input variables, not just one of them at a time. If we also give the leaf models a probabilistic interpretation, we arrive at a fully probabilistic tree-based model called the hierarchical mixture of experts, which is considered in Section 14.5.3.

An alternative way to motivate the hierarchical mixture of experts model is to start with a standard probabilistic mixture of unconditional density models, such as Gaussians, and replace the component densities with conditional distributions. Here we consider mixtures of linear regression models (Section 14.5.1) and mixtures of logistic regression models (Section 14.5.2). In the simplest case, the mixing coefficients are independent of the input variables. If we make a further generalization and allow the mixing coefficients also to depend on the inputs, we obtain a mixture of experts model. Finally, if we allow each component in the mixture model to be itself a mixture of experts model, we obtain a hierarchical mixture of experts.

14.5.1 Mixtures of linear regression models


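One of the many advantages of giving a probabilistic interpretation to the linear regression model is that it can then be used as a component in more complex probabilistic models. This can be done, for instance, by viewing the conditional distribution representing the linear regression model as a node in a directed probabilistic graph. Here we consider a simple example corresponding to a mixture of linear regression models, which represents a straightforward extension of the Gaussian mixture model discussed in Section 9.2 to the case of conditional Gaussian distributions.

We consider K linear regression models, each governed by its own weight parameter $\mathbf{w}_k$. In many applications it is appropriate to use a common noise variance, governed by a precision parameter $\beta$, for all K components, and this is the case considered here. We again restrict attention to a single target variable t. If we denote the mixing coefficients by $\pi_k$, then the mixture distribution can be written as:

$$p(t \mid \boldsymbol{\theta}) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(t \mid \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}, \beta^{-1}) \tag{14.34}$$

where $\boldsymbol{\theta}$ denotes the set of all adaptive parameters in the model, namely $\mathbf{W} = \{\mathbf{w}_k\}$, $\boldsymbol{\pi} = \{\pi_k\}$, and $\beta$. Given a data set of observations $\{\boldsymbol{\phi}_n, t_n\}$, the log likelihood function for this model takes the form:

$$\ln p(\mathbf{t} \mid \boldsymbol{\theta}) = \sum_{n=1}^{N} \ln\left(\sum_{k=1}^{K} \pi_k\, \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n, \beta^{-1})\right) \tag{14.35}$$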

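where $\mathbf{t} = (t_1, \ldots, t_N)^{\mathrm{T}}$ denotes the vector of target variables. In order to maximize this likelihood function, we can once again appeal to the EM algorithm, which turns out to be a simple extension of the EM algorithm for unconditional Gaussian mixtures of Section 9.2. We can therefore build on our experience with the unconditional mixture and introduce a set $\mathbf{Z} = \{\mathbf{z}_n\}$ of binary latent variables, where $z_{nk} \in \{0, 1\}$ and, for each data point n, all of the elements $k = 1, \ldots, K$ are zero except for a single value of 1 indicating which component of the mixture was responsible for generating that data point. The joint distribution over latent and observed variables can be represented by the graphical model shown in Figure 14.7. The complete-data log likelihood function then takes the form:

$$\ln p(\mathbf{t}, \mathbf{Z} \mid \boldsymbol{\theta}) = \sum_{n=1}^{N} \sum_{k=1}^{K} z_{nk} \ln\left\{\pi_k\, \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n, \beta^{-1})\right\} \tag{14.36}$$

Figure 14.7 Probabilistic directed graph representing a mixture of linear regression models, defined by (14.35).

The EM algorithm begins by choosing an initial value $\boldsymbol{\theta}^{\mathrm{old}}$ for the model parameters. In the E step, these parameter values are used to evaluate the posterior probabilities, or responsibilities, of each component k for every data point n, given by:

$$\gamma_{nk} = \mathbb{E}[z_{nk}] = p(k \mid \boldsymbol{\phi}_n, \boldsymbol{\theta}^{\mathrm{old}}) = \frac{\pi_k\, \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n, \beta^{-1})}{\sum_j \pi_j\, \mathcal{N}(t_n \mid \mathbf{w}_j^{\mathrm{T}} \boldsymbol{\phi}_n, \beta^{-1})} \tag{14.37}$$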

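The responsibilities are then used to determine the expectation, with respect to the posterior distribution $p(\mathbf{Z} \mid \mathbf{t}, \boldsymbol{\theta}^{\mathrm{old}})$, of the complete-data log likelihood, which takes the form:

$$Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}}) = \mathbb{E}_{\mathbf{Z}}\left[\ln p(\mathbf{t}, \mathbf{Z} \mid \boldsymbol{\theta})\right] = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left\{\ln \pi_k + \ln \mathcal{N}(t_n \mid \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n, \beta^{-1})\right\}$$

In the M step, we maximize the function $Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}})$ with respect to $\boldsymbol{\theta}$, keeping the $\gamma_{nk}$ fixed. For the optimization with respect to the mixing coefficients $\pi_k$ we need to take account of the constraint $\sum_k \pi_k = 1$, which can be done with the aid of a Lagrange multiplier, leading to an M-step re-estimation equation for $\pi_k$ in the form:

$$\pi_k = \frac{1}{N} \sum_{n=1}^{N} \gamma_{nk} \tag{14.38}$$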

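Note that this has exactly the same form as the corresponding result for a simple mixture of unconditional Gaussians given by (9.22). Next consider the maximization with respect to the parameter vector $\mathbf{w}_k$ of the kth linear regression model. Substituting for the Gaussian distribution, we see that the function $Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}})$, as a function of the parameter vector $\mathbf{w}_k$, takes the form:

$$Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}}) = \sum_{n=1}^{N} \gamma_{nk} \left\{-\frac{\beta}{2} \left(t_n - \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n\right)^2\right\} + \mathrm{const} \tag{14.39}$$

where the constant term includes the contributions from the other weight vectors $\mathbf{w}_j$ for $j \neq k$. Note that the quantity we are maximizing is similar to (the negative of) the standard sum-of-squares error (3.12) for a single linear regression model, but with the inclusion of the responsibilities $\gamma_{nk}$. This represents a weighted least squares problem, in which the term corresponding to the nth data point carries a weighting coefficient given by $\beta \gamma_{nk}$, which could be interpreted as an effective precision for each data point. We see that each component linear regression model in the mixture, governed by its own parameter vector $\mathbf{w}_k$, is fitted separately to the whole data set in the M step, but with each data point n weighted by the responsibility $\gamma_{nk}$ that model k takes for that data point. Setting the derivative of (14.39) with respect to $\mathbf{w}_k$ equal to zero gives:

$$0 = \sum_{n=1}^{N} \gamma_{nk} \left(t_n - \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n\right) \boldsymbol{\phi}_n \tag{14.40}$$

which we can write in matrix notation as:

$$0 = \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{R}_k (\mathbf{t} - \boldsymbol{\Phi} \mathbf{w}_k) \tag{14.41}$$

where $\mathbf{R}_k = \mathrm{diag}(\gamma_{nk})$ is a diagonal matrix of size N x N. Solving for $\mathbf{w}_k$, we have:

$$\mathbf{w}_k = \left(\boldsymbol{\Phi}^{\mathrm{T}} \mathbf{R}_k \boldsymbol{\Phi}\right)^{-1} \boldsymbol{\Phi}^{\mathrm{T}} \mathbf{R}_k \mathbf{t} \tag{14.42}$$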

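This represents a set of modified normal equations corresponding to the weighted least squares problem, of the same form as (4.99) found in the context of logistic regression. Note that after each E step the matrix $\mathbf{R}_k$ will change, and so the normal equations have to be solved afresh in the subsequent M step. Finally, we maximize $Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}})$ with respect to $\beta$. Keeping only terms that depend on $\beta$, the function $Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}})$ can be written as:

$$Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}}) = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left\{\frac{1}{2} \ln \beta - \frac{\beta}{2} \left(t_n - \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n\right)^2\right\} \tag{14.43}$$

Setting the derivative with respect to $\beta$ equal to zero and rearranging, we obtain the M-step equation for $\beta$ in the form:

$$\frac{1}{\beta} = \frac{1}{N} \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left(t_n - \mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n\right)^2 \tag{14.44}$$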

In Figure 14.8 we illustrate this EM algorithm using the simple example of fitting a mixture of two straight lines to a data set having one input variable x and one target variable t. The predictive density (14.34) is plotted in Figure 14.9 using the converged parameter values obtained from the EM algorithm, corresponding to the right-hand plot in Figure 14.8. Also shown in this figure is the result of fitting a single linear regression model, which gives a unimodal predictive density. We see that the mixture model gives a much better representation of the data distribution, and this is reflected in the higher likelihood value. However, the mixture model also assigns significant probability mass to regions where there is no data, because its predictive distribution is bimodal for all values of x. This problem can be resolved by extending the model to allow the mixing coefficients themselves to be functions of x, leading to models such as the mixture density networks discussed in Section 5.6 and the hierarchical mixture of experts discussed in Section 14.5.3.

Figure 14.8: Example of a synthetic data set, shown by the green points, having one input variable x and one target variable t, together with a mixture of two linear regression models whose mean functions $y(x, \mathbf{w}_k)$, where $k \in \{1, 2\}$, are shown by the blue and red lines. The upper three plots show the initial configuration (left), the result of running 30 iterations of EM (centre), and the result after 50 iterations of EM (right). Here $\beta$ was initialized to the reciprocal of the variance of the set of target values. The lower three plots show the corresponding responsibilities plotted as a vertical line for each data point, in which the length of the blue segment gives the posterior probability of the blue line for that data point (and similarly for the red segment).

Figure 14.9: The left plot shows the predictive conditional density corresponding to the converged solution in Figure 14.8. This gives a log likelihood value of -3.0. A vertical slice through one of these plots at a particular value of x represents the corresponding conditional distribution $p(t \mid x)$, which is seen to be bimodal. The plot on the right shows the predictive density for a single linear regression model fitted to the same data set using maximum likelihood. This model has a smaller log likelihood of -27.6.
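The E and M steps of this section, namely the responsibilities, the update $\pi_k = \frac{1}{N}\sum_n \gamma_{nk}$, the weighted least squares solution for each $\mathbf{w}_k$, and the update for $\beta$, fit naturally into a short loop. The following is a sketch under stated assumptions rather than the procedure used for the figures: the weights are initialized randomly, $\beta$ is initialized to the reciprocal of the target variance, a fixed number of iterations is run, and `numpy.linalg.lstsq` is used to solve the weighted normal equations. The function names are hypothetical.

```python
import numpy as np

def gauss(t, mean, beta):
    """Univariate Gaussian density with mean `mean` and precision `beta`."""
    return np.sqrt(beta / (2 * np.pi)) * np.exp(-0.5 * beta * (t - mean) ** 2)

def em_mixture_linreg(Phi, t, K, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    N, D = Phi.shape
    W = rng.normal(size=(K, D))                   # one weight vector w_k per component
    pi = np.full(K, 1.0 / K)                      # mixing coefficients
    beta = 1.0 / np.var(t)                        # shared noise precision
    log_liks = []
    for _ in range(n_iter):
        # E step: responsibilities gamma_nk
        dens = pi * gauss(t[:, None], Phi @ W.T, beta)         # shape (N, K)
        log_liks.append(np.sum(np.log(dens.sum(axis=1))))      # log likelihood before the update
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M step
        pi = gamma.mean(axis=0)                                # mixing coefficient update
        for k in range(K):
            R = gamma[:, k]                                    # diagonal of R_k
            A = Phi.T @ (R[:, None] * Phi)                     # Phi^T R_k Phi
            b = Phi.T @ (R * t)                                # Phi^T R_k t
            W[k] = np.linalg.lstsq(A, b, rcond=None)[0]        # weighted least squares solution
        resid2 = (t[:, None] - Phi @ W.T) ** 2
        beta = N / np.sum(gamma * resid2)                      # precision update
    return pi, W, beta, np.array(log_liks)
```

Since every M-step update here is an exact maximizer of $Q$, the recorded log likelihood should not decrease from one iteration to the next, which is a useful sanity check when experimenting with different initializations.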

14.5.2 Mixtures of logistic models


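Because the logistic regression model defines a conditional distribution for the target variable, given the input vector, it is straightforward to use it as the component distribution in a mixture model, thereby giving rise to a richer family of conditional distributions than a single logistic regression model. This example involves a straightforward combination of ideas encountered in earlier sections and will help to consolidate them. The conditional distribution of the target variable, for a probabilistic mixture of K logistic regression models, is given by:

$$p(t \mid \boldsymbol{\phi}, \boldsymbol{\theta}) = \sum_{k=1}^{K} \pi_k\, y_k^{t} \left[1 - y_k\right]^{1-t} \tag{14.45}$$

where $\boldsymbol{\phi}$ is the feature vector, $y_k = \sigma(\mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi})$ is the output of component k, and $\boldsymbol{\theta}$ denotes the adjustable parameters, namely $\{\pi_k\}$ and $\{\mathbf{w}_k\}$. Now suppose we are given a data set $\{\boldsymbol{\phi}_n, t_n\}$. The corresponding likelihood function is:

$$p(\mathbf{t} \mid \boldsymbol{\theta}) = \prod_{n=1}^{N} \left(\sum_{k=1}^{K} \pi_k\, y_{nk}^{t_n} \left[1 - y_{nk}\right]^{1-t_n}\right) \tag{14.46}$$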

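where $y_{nk} = \sigma(\mathbf{w}_k^{\mathrm{T}} \boldsymbol{\phi}_n)$ and $\mathbf{t} = (t_1, \ldots, t_N)^{\mathrm{T}}$. We can maximize this likelihood function iteratively by using the EM algorithm. This involves introducing latent variables $z_{nk}$ that correspond to a 1-of-K coded binary indicator variable for each data point n. The complete-data likelihood function is then given by:

$$p(\mathbf{t}, \mathbf{Z} \mid \boldsymbol{\theta}) = \prod_{n=1}^{N} \prod_{k=1}^{K} \left\{\pi_k\, y_{nk}^{t_n} \left[1 - y_{nk}\right]^{1-t_n}\right\}^{z_{nk}} \tag{14.47}$$

where $\mathbf{Z}$ is the matrix of latent variables with elements $z_{nk}$. We initialize the EM algorithm by choosing an initial value $\boldsymbol{\theta}^{\mathrm{old}}$ for the model parameters. In the E step, we then use these parameter values to evaluate the posterior probabilities of the components k for each data point n, which are given by:

$$\gamma_{nk} = \mathbb{E}[z_{nk}] = p(k \mid \boldsymbol{\phi}_n, \boldsymbol{\theta}^{\mathrm{old}}) = \frac{\pi_k\, y_{nk}^{t_n} \left[1 - y_{nk}\right]^{1-t_n}}{\sum_j \pi_j\, y_{nj}^{t_n} \left[1 - y_{nj}\right]^{1-t_n}} \tag{14.48}$$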

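These responsibilities are then used to find the expected complete-data log likelihood as a function of $\boldsymbol{\theta}$, given by:

$$Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}}) = \mathbb{E}_{\mathbf{Z}}\left[\ln p(\mathbf{t}, \mathbf{Z} \mid \boldsymbol{\theta})\right] = \sum_{n=1}^{N} \sum_{k=1}^{K} \gamma_{nk} \left\{\ln \pi_k + t_n \ln y_{nk} + (1 - t_n) \ln(1 - y_{nk})\right\} \tag{14.49}$$

The M step involves maximization of this function with respect to $\boldsymbol{\theta}$, keeping $\boldsymbol{\theta}^{\mathrm{old}}$, and hence $\gamma_{nk}$, fixed. Maximization with respect to $\pi_k$ can be done in the usual way, with a Lagrange multiplier to enforce the summation constraint $\sum_k \pi_k = 1$, giving the familiar result:

$$\pi_k = \frac{1}{N} \sum_{n=1}^{N} \gamma_{nk} \tag{14.50}$$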

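To determine the $\{\mathbf{w}_k\}$, we note that the function $Q(\boldsymbol{\theta}, \boldsymbol{\theta}^{\mathrm{old}})$ comprises a sum over terms indexed by k, each of which depends only on one of the vectors $\mathbf{w}_k$, so that the different vectors are decoupled in the M step of the EM algorithm. In other words, the different components interact only via the responsibilities, which are fixed during the M step. Note that the M step does not have a closed-form solution and must be solved iteratively using, for instance, the iterative reweighted least squares (IRLS) algorithm. The gradient and the Hessian for the vector $\mathbf{w}_k$ are given by:

$$\nabla_k Q = \sum_{n=1}^{N} \gamma_{nk} (t_n - y_{nk}) \boldsymbol{\phi}_n \tag{14.51}$$

$$\mathbf{H}_k = -\nabla_k \nabla_k Q = \sum_{n=1}^{N} \gamma_{nk}\, y_{nk} (1 - y_{nk})\, \boldsymbol{\phi}_n \boldsymbol{\phi}_n^{\mathrm{T}} \tag{14.52}$$

where $\nabla_k$ denotes the gradient with respect to $\mathbf{w}_k$. For fixed $\gamma_{nk}$, these are independent of $\{\mathbf{w}_j\}$ for $j \neq k$, and so we can solve for each $\mathbf{w}_k$ separately using the IRLS algorithm. The M-step equations for component k therefore correspond simply to fitting a single logistic regression model to a weighted data set in which data point n carries a weight $\gamma_{nk}$. Figure 14.10 shows an example of a mixture of logistic regression models applied to a simple classification problem.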
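The EM procedure for a mixture of logistic regression models can be sketched along the same lines. The version below makes several simplifying assumptions that are not part of the text: each M step performs a single Newton (IRLS) update per component instead of iterating to convergence, the outputs $y_{nk}$ are clipped away from 0 and 1, and a small ridge term is added to the Hessian so that the linear solve is always well defined. The function names are hypothetical.

```python
import numpy as np

def sigmoid(a):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + np.exp(-a))

def em_mixture_logistic(Phi, t, K, n_iter=30, seed=0):
    rng = np.random.default_rng(seed)
    N, D = Phi.shape
    W = rng.normal(scale=0.1, size=(K, D))        # one weight vector w_k per component
    pi = np.full(K, 1.0 / K)                      # mixing coefficients
    for _ in range(n_iter):
        # E step: responsibilities gamma_nk
        Y = np.clip(sigmoid(Phi @ W.T), 1e-10, 1 - 1e-10)      # y_nk, clipped (numerical guard)
        lik = pi * Y ** t[:, None] * (1 - Y) ** (1 - t[:, None])
        gamma = lik / lik.sum(axis=1, keepdims=True)
        # M step
        pi = gamma.mean(axis=0)                                # mixing coefficient update
        for k in range(K):
            y = Y[:, k]
            grad = Phi.T @ (gamma[:, k] * (t - y))             # gradient of Q w.r.t. w_k
            H = Phi.T @ ((gamma[:, k] * y * (1 - y))[:, None] * Phi)   # negative Hessian of Q
            H += 1e-6 * np.eye(D)                              # small ridge: added safeguard
            W[k] += np.linalg.solve(H, grad)                   # one IRLS (Newton) update
    return pi, W, gamma
```

Each component update is a weighted logistic regression step in which data point n carries weight $\gamma_{nk}$. Running several Newton updates per M step would follow the description above more closely, at a higher cost per iteration.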
