
Support Vector Machine

1. Support Vector Machine:


Support Vector Machine (SVM) is a classification method based on statistical learning theory, proposed by Vapnik (1995).
For simplicity we first consider the binary classification problem and then extend the discussion to multiclass classification.
Consider the classification example in the figure: we must find a straight line such that every point on its left is red and every point on its right is blue. A problem in which the classes are separated by a straight line in this way is called linear classification.

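The linear function that discriminates between the two classes is:

$$y(\mathbf{x}) = \mathbf{w}^T \phi(\mathbf{x}) + b \qquad (1)$$

where:

$\mathbf{w} \in \mathbb{R}^m$ is the weight vector, i.e. the normal vector of the separating hyperplane, and $T$ denotes the transpose.
$b \in \mathbb{R}$ is the bias.
$\phi(\mathbf{x}) \in \mathbb{R}^m$ is the feature vector; $\phi$ is the function that maps the input space to the feature space.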


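The training data consist of N input vectors $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ with corresponding target values $\{t_1, \dots, t_N\}$, where $t_n \in \{-1, 1\}$.
A note on terminology: "data point" and "sample" both mean an input vector $\mathbf{x}_i$. In a two-dimensional space the separator is a straight line; in a higher-dimensional space it is called a hyperplane.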

Assume that the data set is perfectly linearly separable in feature space (every sample is classified correctly). Then there exist parameter values $\mathbf{w}$ and $b$ in (1) such that $y(\mathbf{x}_n) > 0$ for points with label $t_n = +1$ and $y(\mathbf{x}_n) < 0$ for points with label $t_n = -1$, and therefore $t_n y(\mathbf{x}_n) > 0$ for every training point.
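The separability condition can be checked numerically. The sketch below assumes the feature map $\phi$ is the identity and uses a hypothetical four-point data set with hand-picked values of w and b; none of these values come from the original text.

```python
def decision_value(w, b, x, phi=lambda v: v):
    """Evaluate y(x) = w^T phi(x) + b from equation (1)."""
    features = phi(x)
    return sum(w_i * f_i for w_i, f_i in zip(w, features)) + b


def is_separated(w, b, X, t, phi=lambda v: v):
    """Return True if t_n * y(x_n) > 0 for every training point."""
    return all(t_n * decision_value(w, b, x_n, phi) > 0 for x_n, t_n in zip(X, t))


# Hypothetical toy data: two points per class in the plane.
X = [(3.0, 1.0), (4.0, 2.0), (1.0, 1.0), (0.0, 2.0)]
t = [+1, +1, -1, -1]

# The vertical line x1 = 2 (w = (1, 0), b = -2) separates the classes.
print(is_separated((1.0, 0.0), -2.0, X, t))   # True
# The horizontal line x2 = 1.5 does not.
print(is_separated((0.0, 1.0), -1.5, X, t))   # False
```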

SVM approaches this problem through the concept of the margin. The margin is defined as the smallest distance from the decision boundary to any data point, that is, the distance from the boundary to the closest points.

In SVM, the best decision boundary is the one with the largest margin: many separating boundaries exist, oriented in different directions, and we choose the one whose margin is largest.

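The distance from a data point $\mathbf{x}$ to the decision surface is:

$$\frac{|y(\mathbf{x})|}{\|\mathbf{w}\|}$$

Since we are considering the case where all data points are correctly classified, $t_n y(\mathbf{x}_n) > 0$ for all n. The distance from a point $\mathbf{x}_n$ to the decision surface can therefore be rewritten as:

$$\frac{t_n y(\mathbf{x}_n)}{\|\mathbf{w}\|} = \frac{t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b)}{\|\mathbf{w}\|} \qquad (2)$$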
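As a small illustration of (2), the sketch below computes the distance of each point to the boundary and takes the minimum as the margin. It assumes an identity feature map and a hypothetical toy data set. The second call shows that multiplying w and b by the same constant leaves the margin unchanged.

```python
import math


def signed_distance(w, b, x, t):
    """Equation (2): t * (w^T x + b) / ||w||, with phi taken as the identity."""
    norm_w = math.sqrt(sum(w_i * w_i for w_i in w))
    return t * (sum(w_i * x_i for w_i, x_i in zip(w, x)) + b) / norm_w


def margin(w, b, X, t):
    """Smallest distance from the boundary to any training point."""
    return min(signed_distance(w, b, x_n, t_n) for x_n, t_n in zip(X, t))


# Hypothetical toy data: two points per class in the plane.
X = [(3.0, 1.0), (4.0, 2.0), (1.0, 1.0), (0.0, 2.0)]
t = [+1, +1, -1, -1]

print(margin((1.0, 0.0), -2.0, X, t))   # 1.0
print(margin((2.0, 0.0), -4.0, X, t))   # 1.0 (same boundary, rescaled)
print(margin((1.0, 0.0), -2.5, X, t))   # 0.5 (boundary shifted toward class +1)
```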

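The margin is the perpendicular distance to the closest data point $\mathbf{x}_n$ in the data set, and we want the values of $\mathbf{w}$ and $b$ that maximize this distance. The problem to solve is therefore:

$$\arg\max_{\mathbf{w}, b} \left\{ \frac{1}{\|\mathbf{w}\|} \min_n \left[ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) \right] \right\} \qquad (3)$$

The factor $1/\|\mathbf{w}\|$ can be taken outside the minimization because $\mathbf{w}$ does not depend on n. Solving this problem directly would be very complex, so we convert it into an equivalent problem that is easier to solve. Rescaling $\mathbf{w} \to \kappa \mathbf{w}$ and $b \to \kappa b$ leaves the distance (2) from every data point to the decision surface unchanged, so the rescaling does not change the nature of the problem. We use this freedom to set, for the point closest to the surface:

$$t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) = 1 \qquad (4)$$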

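From now on, all data points satisfy the constraints:

$$t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) \geq 1, \quad n = 1, \dots, N \qquad (5)$$

The optimization problem, which requires maximizing $\|\mathbf{w}\|^{-1}$, becomes that of minimizing $\|\mathbf{w}\|^2$, so we rewrite it as:

$$\arg\min_{\mathbf{w}, b} \frac{1}{2} \|\mathbf{w}\|^2 \qquad (6)$$

The factor $\frac{1}{2}$ is included to simplify the derivatives taken later.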
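The rescaling argument can be made concrete with a short sketch. Given any (w, b) that separates the data, it computes the factor $\kappa$ that brings the closest point to $t_n y(\mathbf{x}_n) = 1$; the margin is then $1/\|\mathbf{w}\|$. The function name `canonical_form`, the identity feature map and the toy data are illustrative assumptions.

```python
import math


def canonical_form(w, b, X, t):
    """Rescale (w, b) by kappa so that min_n t_n * (w^T x_n + b) equals 1."""
    smallest = min(
        t_n * (sum(w_i * x_i for w_i, x_i in zip(w, x_n)) + b)
        for x_n, t_n in zip(X, t)
    )
    if smallest <= 0:
        raise ValueError("(w, b) does not separate the data")
    kappa = 1.0 / smallest
    return tuple(kappa * w_i for w_i in w), kappa * b


# Hypothetical toy data: two points per class in the plane.
X = [(3.0, 1.0), (4.0, 2.0), (1.0, 1.0), (0.0, 2.0)]
t = [+1, +1, -1, -1]

w, b = canonical_form((4.0, 0.0), -8.0, X, t)
norm_w = math.sqrt(sum(w_i * w_i for w_i in w))
print(w, b)            # (1.0, 0.0) -2.0
print(1.0 / norm_w)    # 1.0, the margin of this boundary
```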


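Lagrange multiplier theory:
The problem of maximizing a function f(x) subject to the constraint $g(x) \geq 0$ is rewritten as the optimization of the Lagrangian function:

$$L(x, \lambda) \equiv f(x) + \lambda g(x)$$

where x and $\lambda$ must satisfy the Karush-Kuhn-Tucker (KKT) conditions:

$$g(x) \geq 0$$
$$\lambda \geq 0$$
$$\lambda g(x) = 0$$

If instead we minimize f(x), the Lagrangian function is:

$$L(x, \lambda) \equiv f(x) - \lambda g(x)$$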
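A one-dimensional worked example may help fix the sign conventions. The functions $f(x) = x^2$ and $g(x) = x - 1$ are chosen here purely for illustration.

```latex
\begin{align*}
&\text{Minimize } f(x) = x^2 \text{ subject to } g(x) = x - 1 \geq 0. \\
&L(x, \lambda) = x^2 - \lambda (x - 1) \\
&\frac{\partial L}{\partial x} = 2x - \lambda = 0 \;\Rightarrow\; \lambda = 2x \\
&\lambda\, g(x) = 0 \;\Rightarrow\; \lambda = 0 \ \text{or}\ x = 1 \\
&\lambda = 0 \;\Rightarrow\; x = 0, \text{ which violates } g(x) \geq 0 \\
&\text{Hence } x = 1,\ \lambda = 2 \geq 0,\ f(x) = 1.
\end{align*}
```

The constraint is active at the solution ($g(x) = 0$ with $\lambda > 0$), which is the same situation the support vectors are in later in the derivation.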

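To solve the problem above, we write the Lagrangian function:

$$L(\mathbf{w}, b, \mathbf{a}) = \frac{1}{2} \|\mathbf{w}\|^2 - \sum_{n=1}^{N} a_n \left\{ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) - 1 \right\} \qquad (7)$$

where $\mathbf{a} = (a_1, \dots, a_N)^T$ is the vector of Lagrange multipliers.

Note the minus sign (-) in the Lagrangian: we minimize with respect to $\mathbf{w}$ and b, and maximize with respect to $\mathbf{a}$.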


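Setting the derivatives of $L(\mathbf{w}, b, \mathbf{a})$ with respect to $\mathbf{w}$ and b to zero gives:

$$\frac{\partial L}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) \qquad (8)$$

$$\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; 0 = \sum_{n=1}^{N} a_n t_n \qquad (9)$$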

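Eliminating $\mathbf{w}$ and b from $L(\mathbf{w}, b, \mathbf{a})$ by substituting (8) and (9) leads to the optimization problem of maximizing:

$$\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \qquad (10)$$

subject to the constraints:

$$a_n \geq 0, \quad n = 1, \dots, N \qquad (11)$$

$$\sum_{n=1}^{N} a_n t_n = 0 \qquad (12)$$

Here the kernel function is defined as $k(\mathbf{x}_n, \mathbf{x}_m) = \phi(\mathbf{x}_n)^T \phi(\mathbf{x}_m)$.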
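To make the dual concrete, the sketch below defines two common kernels and evaluates $\tilde{L}(\mathbf{a})$ from (10) on a hypothetical two-point data set. The `gamma` parameter of the Gaussian kernel and its default value are assumptions made for illustration.

```python
import math


def linear_kernel(x, z):
    """k(x, z) = x^T z, i.e. phi is the identity map."""
    return sum(x_i * z_i for x_i, z_i in zip(x, z))


def gaussian_kernel(x, z, gamma=0.5):
    """k(x, z) = exp(-gamma * ||x - z||^2); gamma is a free parameter."""
    return math.exp(-gamma * sum((x_i - z_i) ** 2 for x_i, z_i in zip(x, z)))


def dual_objective(a, X, t, kernel):
    """Evaluate L~(a) from equation (10)."""
    n = len(X)
    quad = sum(
        a[i] * a[j] * t[i] * t[j] * kernel(X[i], X[j])
        for i in range(n) for j in range(n)
    )
    return sum(a) - 0.5 * quad


# Hypothetical one-dimensional data: x = 3 in class +1, x = 1 in class -1.
X = [(3.0,), (1.0,)]
t = [+1, -1]
print(dual_objective([0.5, 0.5], X, t, linear_kernel))   # 0.5
print(gaussian_kernel((1.0, 2.0), (1.0, 2.0)))           # 1.0
```

With two points, constraint (12) forces $a_1 = a_2 = a$, and (10) reduces to $2a - 2a^2$ for this data, whose maximum is at $a = 0.5$.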

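We set this problem aside for now and discuss the technique for solving (10) subject to (11) and (12) later.
To classify a new data point with the trained model, we evaluate the sign of $y(\mathbf{x})$ from (1), substituting the expression for $\mathbf{w}$ from (8):

$$y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n k(\mathbf{x}, \mathbf{x}_n) + b \qquad (13)$$

The solution satisfies the following KKT conditions:

$$a_n \geq 0 \qquad (14)$$

$$t_n y(\mathbf{x}_n) - 1 \geq 0 \qquad (15)$$

$$a_n \{ t_n y(\mathbf{x}_n) - 1 \} = 0 \qquad (16)$$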

Therefore, for every data point, either $a_n = 0$ or $t_n y(\mathbf{x}_n) = 1$. Data points with $a_n = 0$ do not appear in (13) and so play no role in predicting new data points.
The remaining data points ($a_n > 0$) are called support vectors. They satisfy $t_n y(\mathbf{x}_n) = 1$, so they are the points that lie on the margin of the hyperplane in feature space.

The support vectors are what matters in SVM training: classifying a new data point depends only on the support vectors.
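A minimal sketch of prediction with (13), summing only over the support vectors. The toy model is hypothetical: for the two one-dimensional points x = 3 (class +1) and x = 1 (class -1) with a linear kernel, solving (10) to (12) by hand gives a = (0.5, 0.5), and the condition $t_n y(\mathbf{x}_n) = 1$ then gives b = -2.

```python
def linear_kernel(x, z):
    return sum(x_i * z_i for x_i, z_i in zip(x, z))


def predict_value(x, support_vectors, a, t, b, kernel=linear_kernel):
    """Equation (13), summed over the support vectors only."""
    return sum(a_n * t_n * kernel(x, x_n)
               for x_n, a_n, t_n in zip(support_vectors, a, t)) + b


def predict_label(x, support_vectors, a, t, b, kernel=linear_kernel):
    """The predicted class is the sign of y(x)."""
    return 1 if predict_value(x, support_vectors, a, t, b, kernel) >= 0 else -1


# Hypothetical trained model (see the lead-in for how it was obtained).
sv = [(3.0,), (1.0,)]
a = [0.5, 0.5]
t = [+1, -1]
b = -2.0

print(predict_value((5.0,), sv, a, t, b))   # 3.0
print(predict_label((0.0,), sv, a, t, b))   # -1
```

For this model (13) simplifies to y(x) = x - 2, so the decision boundary sits at x = 2, halfway between the two support vectors.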
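Suppose we have solved (10) and found the multipliers $\mathbf{a}$. We now determine the parameter b using the support vectors $\mathbf{x}_n$, which satisfy $t_n y(\mathbf{x}_n) = 1$. Substituting (13) gives:

$$t_n \left( \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) + b \right) = 1 \qquad (17)$$

where S is the set of support vectors. Although a single support vector $\mathbf{x}_n$ is enough to solve for b, a numerically more stable value is obtained by averaging over all support vectors.
First multiply (17) by $t_n$ (noting that $t_n^2 = 1$); the value of b is then:

$$b = \frac{1}{N_S} \sum_{n \in S} \left( t_n - \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) \right) \qquad (18)$$

where $N_S$ is the total number of support vectors.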
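Equation (18) translates directly into code. The sketch below assumes a linear kernel and a hypothetical two-point model (support vectors x = 3 and x = 1 with multipliers 0.5 each); the function name is illustrative.

```python
def linear_kernel(x, z):
    return sum(x_i * z_i for x_i, z_i in zip(x, z))


def bias_from_support_vectors(sv, a, t, kernel=linear_kernel):
    """Equation (18): average of t_n - sum_m a_m t_m k(x_n, x_m) over S."""
    residuals = [
        t_n - sum(a_m * t_m * kernel(x_n, x_m)
                  for x_m, a_m, t_m in zip(sv, a, t))
        for x_n, t_n in zip(sv, t)
    ]
    return sum(residuals) / len(residuals)


# Hypothetical support vectors, multipliers and labels.
sv = [(3.0,), (1.0,)]
a = [0.5, 0.5]
t = [+1, -1]
print(bias_from_support_vectors(sv, a, t))   # -2.0
```

Here both support vectors give the same residual (-2), so the average changes nothing; with a numerically approximate solution for a, the individual residuals differ slightly and the average smooths them out.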


So far, to simplify the presentation, we have assumed that the data points are perfectly separable in the feature space $\phi(\mathbf{x})$. Enforcing perfect separation, however, can lead to poor generalization: in practice some samples may have been mislabeled during data collection, and insisting on separating them perfectly makes the model overfit.

To counter overfitting, we allow a few points to be misclassified.


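To do this, we introduce slack variables $\xi_n \geq 0$, with $n = 1, \dots, N$, one for each data point:

$\xi_n = 0$ for points that lie on the margin or on the correct side of it.
$\xi_n = |t_n - y(\mathbf{x}_n)|$ for the remaining points.

Points that lie on the decision boundary, $y(\mathbf{x}_n) = 0$, therefore have $\xi_n = 1$, and misclassified points have $\xi_n > 1$.

Constraint (5) is rewritten as:

$$t_n y(\mathbf{x}_n) \geq 1 - \xi_n, \quad n = 1, \dots, N \qquad (20)$$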

Our goal is now to maximize the margin while remaining lenient toward the points that are misclassified. The quantity to minimize becomes:

$$C \sum_{n=1}^{N} \xi_n + \frac{1}{2} \|\mathbf{w}\|^2 \qquad (21)$$
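The slack variables and the objective (21) can be evaluated for a given (w, b) as in the sketch below. It relies on the fact that, since $t_n = \pm 1$, a point that violates the margin has $|t_n - y(\mathbf{x}_n)| = 1 - t_n y(\mathbf{x}_n)$, so $\xi_n = \max(0, 1 - t_n y(\mathbf{x}_n))$ covers both cases of the definition. The identity feature map, the toy data and the value C = 1 are assumptions.

```python
def slack_variables(w, b, X, t):
    """xi_n = max(0, 1 - t_n * y(x_n)), with phi taken as the identity."""
    values = []
    for x_n, t_n in zip(X, t):
        y_n = sum(w_i * x_i for w_i, x_i in zip(w, x_n)) + b
        values.append(max(0.0, 1.0 - t_n * y_n))
    return values


def soft_margin_objective(w, b, X, t, C):
    """Equation (21): C * sum_n xi_n + 0.5 * ||w||^2."""
    xi = slack_variables(w, b, X, t)
    return C * sum(xi) + 0.5 * sum(w_i * w_i for w_i in w)


# Hypothetical one-dimensional data; the last two points violate the margin.
X = [(3.0,), (1.0,), (2.5,), (1.5,)]
t = [+1, -1, +1, +1]

print(slack_variables((1.0,), -2.0, X, t))               # [0.0, 0.0, 0.5, 1.5]
print(soft_margin_objective((1.0,), -2.0, X, t, C=1.0))  # 2.5
```

The third point is correctly classified but inside the margin ($0 < \xi_n \leq 1$); the fourth is misclassified ($\xi_n > 1$).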

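In (21), the parameter C > 0 controls the trade-off between the slack-variable penalty and the margin.

We now minimize (21) subject to the constraints (20) and $\xi_n \geq 0$. The corresponding Lagrangian is:

$$L(\mathbf{w}, b, \mathbf{a}) = \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{n=1}^{N} \xi_n - \sum_{n=1}^{N} a_n \{ t_n y(\mathbf{x}_n) - 1 + \xi_n \} - \sum_{n=1}^{N} \mu_n \xi_n \qquad (22)$$

where $\{a_n \geq 0\}$ and $\{\mu_n \geq 0\}$ are the Lagrange multipliers.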

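The KKT conditions that must hold are:

$$a_n \geq 0 \qquad (23)$$

$$t_n y(\mathbf{x}_n) - 1 + \xi_n \geq 0 \qquad (24)$$

$$a_n ( t_n y(\mathbf{x}_n) - 1 + \xi_n ) = 0 \qquad (25)$$

$$\mu_n \geq 0 \qquad (26)$$

$$\xi_n \geq 0 \qquad (27)$$

$$\mu_n \xi_n = 0 \qquad (28)$$

for n = 1, ..., N.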
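Setting the derivatives of (22) with respect to $\mathbf{w}$, b and $\{\xi_n\}$ to zero gives:

$$\frac{\partial L}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) \qquad (29)$$

$$\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{n=1}^{N} a_n t_n = 0 \qquad (30)$$

$$\frac{\partial L}{\partial \xi_n} = 0 \;\Rightarrow\; a_n = C - \mu_n \qquad (31)$$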

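Substituting (29), (30) and (31) into (22) gives:

$$\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \qquad (32)$$

From (23), (26) and (31) we have $a_n \leq C$.

The optimization problem is identical to the perfectly separable case; only the constraints differ:

$$0 \leq a_n \leq C \qquad (33)$$

$$\sum_{n=1}^{N} a_n t_n = 0 \qquad (34)$$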

Substituting (29) into (1) shows that the prediction for a new data point has the same form as (13).

As before, points with $a_n = 0$ contribute nothing to the prediction of new data points.
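The remaining points are the support vectors. They have $a_n > 0$ and, by (25), satisfy:

$$t_n y(\mathbf{x}_n) = 1 - \xi_n \qquad (35)$$

If $a_n < C$, then (31) gives $\mu_n > 0$, and (28) implies $\xi_n = 0$: these points lie on the margin.

Points with $a_n = C$ may be correctly classified points lying between the margin and the decision boundary if $\xi_n \leq 1$, or misclassified points if $\xi_n > 1$.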
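The case analysis above can be summarized as a small lookup. The function name, the returned descriptions and the numerical tolerance `tol` are illustrative assumptions; a tolerance is needed because multipliers produced by a numerical solver are rarely exactly 0 or C.

```python
def describe_point(a_n, xi_n, C, tol=1e-9):
    """Classify a training point from its multiplier a_n and slack xi_n."""
    if a_n <= tol:
        return "not a support vector"
    if a_n < C - tol:
        # 0 < a_n < C: mu_n > 0 by (31), hence xi_n = 0 by (28).
        return "support vector on the margin"
    if xi_n <= 1.0:
        return "support vector inside the margin, correctly classified"
    return "support vector, misclassified"


C = 1.0
print(describe_point(0.0, 0.0, C))   # not a support vector
print(describe_point(0.4, 0.0, C))   # support vector on the margin
print(describe_point(1.0, 0.5, C))   # support vector inside the margin, correctly classified
print(describe_point(1.0, 1.5, C))   # support vector, misclassified
```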

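To determine the parameter b in (1), we use the support vectors with $0 < a_n < C$, for which $\xi_n = 0$ and hence $t_n y(\mathbf{x}_n) = 1$:

$$t_n \left( \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) + b \right) = 1 \qquad (36)$$

Again, to obtain a numerically stable value of b we average:

$$b = \frac{1}{N_M} \sum_{n \in M} \left( t_n - \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) \right) \qquad (37)$$

where M is the set of points with $0 < a_n < C$ and $N_M$ is the number of such points.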

To solve (10) and (32), we use the Sequential Minimal Optimization (SMO) algorithm introduced by Platt in 1999.
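The core idea of SMO is to optimize two multipliers at a time, analytically, so that constraint (34) stays satisfied. The sketch below is a simplified illustration of that pairwise step and is not Platt's full algorithm: it sweeps over all pairs in a fixed order, omits his heuristics for choosing pairs and does not update b during the iterations. The function names, the toy data and C = 10 are assumptions.

```python
def linear_kernel(x, z):
    return sum(x_i * z_i for x_i, z_i in zip(x, z))


def smo_pair_update(a, i, j, X, t, C, kernel=linear_kernel):
    """Optimize a[i] and a[j] jointly; return True if they changed."""
    if i == j:
        return False

    def f(n):  # y(x_n) without b; b cancels in the error difference below
        return sum(a[m] * t[m] * kernel(X[n], X[m]) for m in range(len(X)))

    eta = kernel(X[i], X[i]) + kernel(X[j], X[j]) - 2.0 * kernel(X[i], X[j])
    if eta <= 0.0:
        return False  # degenerate pair (for example duplicate points); skip

    error_diff = (f(i) - t[i]) - (f(j) - t[j])
    a_j_new = a[j] + t[j] * error_diff / eta

    # Bounds that keep both multipliers inside [0, C], constraint (33).
    if t[i] != t[j]:
        low, high = max(0.0, a[j] - a[i]), min(C, C + a[j] - a[i])
    else:
        low, high = max(0.0, a[i] + a[j] - C), min(C, a[i] + a[j])
    a_j_new = min(high, max(low, a_j_new))

    if abs(a_j_new - a[j]) < 1e-12:
        return False
    a[i] += t[i] * t[j] * (a[j] - a_j_new)   # preserves constraint (34)
    a[j] = a_j_new
    return True


def fit_dual(X, t, C, kernel=linear_kernel, max_passes=100):
    """Sweep over all pairs until no pair changes (no selection heuristics)."""
    a = [0.0] * len(X)
    for _ in range(max_passes):
        changed = False
        for i in range(len(X)):
            for j in range(i + 1, len(X)):
                changed = smo_pair_update(a, i, j, X, t, C, kernel) or changed
        if not changed:
            break
    return a


# Hypothetical one-dimensional data: class +1 at x = 3, 4 and class -1 at x = 1, 0.
X = [(3.0,), (4.0,), (1.0,), (0.0,)]
t = [+1, +1, -1, -1]
print(fit_dual(X, t, C=10.0))   # [0.5, 0.0, 0.5, 0.0]
```

Only the two points closest to the boundary (x = 3 and x = 1) end up with non-zero multipliers, in line with the discussion of support vectors above.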

2. MultiClass SVMs:

We now consider multiclass classification with K > 2 classes. A K-class classifier could be built by combining several two-class discriminants. However, this leads to some difficulties (Duda and Hart, 1973).
The one-versus-the-rest approach uses K-1 binary classifiers to build the K-class classifier.
The one-versus-one approach uses K(K-1)/2 binary classifiers to build the K-class classifier.
Both approaches lead to ambiguous regions in the classification (as shown in the figure).
This problem can be avoided by building the K-class classifier from K linear functions of the form:

$$y_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$$

A point $\mathbf{x}$ is assigned to class $C_k$ if $y_k(\mathbf{x}) > y_j(\mathbf{x})$ for all $j \neq k$.
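The assignment rule amounts to taking the arg max over the K linear functions. In the sketch below the weight vectors `W` and biases `w0` are hypothetical values chosen for illustration; in case of an exact tie the lowest class index is returned, which is an arbitrary choice.

```python
def assign_class(x, W, w0):
    """Return the index k that maximizes y_k(x) = w_k^T x + w_k0."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w_k, x)) + w_k0
        for w_k, w_k0 in zip(W, w0)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical weights for K = 3 classes in the plane.
W = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
w0 = [0.0, 0.0, 0.0]

print(assign_class((2.0, 1.0), W, w0))     # 0
print(assign_class((0.5, 3.0), W, w0))     # 1
print(assign_class((-2.0, -1.0), W, w0))   # 2
```

Because every point receives exactly one label, this construction leaves no ambiguous region.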

Another approach, proposed by Wu (2004), estimates class probabilities for multiclass classification.

3. Application to text classification:


Implementation guide:
Feature vector of a document: a vector whose dimension equals the number of features in the whole data set, where the features are pairwise distinct. A component is 1 if the document contains the corresponding feature and 0 otherwise.
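A minimal sketch of this binary representation, assuming that the features are lowercase words separated by whitespace (the text does not say how features are extracted). The function names and sample documents are hypothetical.

```python
def build_vocabulary(documents):
    """Collect the distinct words of the corpus in a fixed (sorted) order."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def to_binary_vector(document, vocabulary):
    """1 if the document contains the feature, 0 otherwise."""
    words = set(document.lower().split())
    return [1 if feature in words else 0 for feature in vocabulary]


docs = ["cheap pills online", "meeting agenda online"]
vocab = build_vocabulary(docs)
print(vocab)                              # ['agenda', 'cheap', 'meeting', 'online', 'pills']
print(to_binary_vector(docs[0], vocab))   # [0, 1, 0, 1, 1]
print(to_binary_vector(docs[1], vocab))   # [1, 0, 1, 1, 0]
```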

Implementing SVM from scratch is fairly complex, so it is better to use an existing library such as LibSVM or SVMLight.
The algorithm consists of two phases, training and classification:
1. Training:
Input:
The feature vectors of the documents in the training set (an M x N matrix, where M is the number of feature vectors in the training set and N is the number of features per vector).
The label/class of each feature vector in the training set.
The parameters of the SVM model: C and γ (the kernel parameter; the Gaussian kernel is the usual choice).
Output:
The SVM model (the support vectors, the Lagrange multipliers a, and the parameter b).
2. Classification:
Input:
The feature vector of the document to classify.
The SVM model.
Output:
The label/class of the document.
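To hand the training input to LibSVM or SVMLight, the feature vectors and labels are usually written to a sparse text file with one example per line, in the form `<label> <index>:<value> ...` with 1-based feature indices in increasing order. The helper below is a sketch of that conversion; its name and the sample data are hypothetical, and the format details should be checked against the documentation of the library version in use.

```python
def to_libsvm_line(label, vector):
    """Format one example as '<label> <index>:<value> ...' with 1-based indices."""
    pairs = [f"{i}:{v}" for i, v in enumerate(vector, start=1) if v != 0]
    return " ".join([f"{label:+d}"] + pairs)


# Hypothetical training set: (label, binary feature vector) pairs.
training_set = [
    (+1, [0, 1, 0, 1, 1]),
    (-1, [1, 0, 1, 1, 0]),
]
for label, vector in training_set:
    print(to_libsvm_line(label, vector))
# +1 2:1 4:1 5:1
# -1 1:1 3:1 4:1
```

Zero components are simply omitted, which keeps the file small for text data where most features are absent from any given document.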

4. References:


[1] Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer (2007).
