
Some Classification Examples Using SOM and MLP Neural Networks

Cao Thang, 2011

This document shows how to use neural networks in several practical applications. It was also used for a presentation to a group of Japanese students. The author thought it might be useful to other students and revised it for reference. Hopefully it is of some help. It was prepared in limited time, so it may contain errors; corrections and comments are welcome.
Readers can contact the author at thawngc AT gmail DOT com, or at www.facebook/spiceneuro/
If you are interested in neural networks, you can use the SpiceSOM and SpiceMLP software, available for download at http://spice.ci.ritsumei.ac.jp/~thangc/programs
Some of the data presented here is included in the Data folder created when SpiceSOM and SpiceMLP are installed. All results presented here were produced with SpiceSOM and SpiceMLP.
Thank you.
TABLE OF CONTENTS

1. Iris Flower Data Set
2. Students' Score Data
3. Face/Nonface Classification
4. Pedestrian Image Classification
5. Car Image Classification
6. Stock Price Forecasting
7. Exchange Rate Forecasting
8. Forecasting the Water Flow of Hoa Binh Lake
9. Conclusion

Cao Thang, Some Classification Examples Using SOM and MLP Neural Networks

1. Iris Flower Data Set


The Iris dataset contains data for three kinds of flower (Iris setosa, Iris virginica and Iris versicolor), with 50 samples of each. The attributes are the length and width of the sepal and of the petal, measured in centimeters.
Details at http://archive.ics.uci.edu/ml/datasets/Iris

[Figure: photographs of Iris setosa, Iris versicolor and Iris virginica]
Fig. 1. Illustration of Iris flowers (Wikipedia)


1.1. Data preparation

To classify the Iris dataset with a multilayer neural network with 4 inputs and 1 output, we encode the network output as in Table 1:
Table 1. Input and Output of a MLP NN

Data              Output
Iris setosa       0.0
Iris versicolor   0.5
Iris virginica    1.0

ID    Sepal Length   Sepal Width   Petal Length   Petal Width   Output   Species
1     5.1            3.5           1.4            0.2           0.0      setosa
...   ...            ...           ...            ...           ...      ...
51    7              3.2           4.7            1.4           0.5      versicolor
...   ...            ...           ...            ...           ...      ...
150   5.9            3             5.1            1.8           1.0      virginica

To classify the Iris dataset with a multilayer neural network with 4 inputs and 3 outputs, we encode the network outputs as in Table 2:
Table 2. Input and Output of a MLP NN

Data              Output 1   Output 2   Output 3
Iris setosa       1.0        0.0        0.0
Iris versicolor   0.0        1.0        0.0
Iris virginica    0.0        0.0        1.0

ID    Sepal Length   Sepal Width   Petal Length   Petal Width   Output 1   Output 2   Output 3   Species
1     5.1            3.5           1.4            0.2           1.0        0.0        0.0        setosa
...   ...            ...           ...            ...           ...        ...        ...        ...
51    7              3.2           4.7            1.4           0.0        1.0        0.0        versicolor
...   ...            ...           ...            ...           ...        ...        ...        ...
150   5.9            3             5.1            1.8           0.0        0.0        1.0        virginica
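
The two output codings in Tables 1 and 2 can be produced with a few lines of code. The following is a minimal sketch, not part of SpiceMLP; the function names `encode_single_output` and `encode_three_outputs` are illustrative assumptions, and the three sample rows are the ones listed in the tables.

```python
import numpy as np

SPECIES = ["setosa", "versicolor", "virginica"]

def encode_single_output(species):
    """Table 1 coding: a single output with values 0.0, 0.5 or 1.0."""
    return SPECIES.index(species) / (len(SPECIES) - 1)

def encode_three_outputs(species):
    """Table 2 coding: one output per class, 1.0 for the true class."""
    target = np.zeros(len(SPECIES))
    target[SPECIES.index(species)] = 1.0
    return target

# Rows 1, 51 and 150 of the Iris data, as listed in Tables 1 and 2.
rows = [
    ([5.1, 3.5, 1.4, 0.2], "setosa"),
    ([7.0, 3.2, 4.7, 1.4], "versicolor"),
    ([5.9, 3.0, 5.1, 1.8], "virginica"),
]
X = np.array([features for features, _ in rows])             # 4 inputs per sample
y1 = np.array([encode_single_output(s) for _, s in rows])    # 1 output per sample
y3 = np.array([encode_three_outputs(s) for _, s in rows])    # 3 outputs per sample
```

The single-output coding imposes an order on the classes (versicolor sits between setosa and virginica), while the three-output coding treats the classes symmetrically.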


To classify the Iris dataset with a SOM (Self-Organizing Map), the data does not need an output column as it does for the MLP NN. We therefore prepare the data as in Fig. 2:

      |<------------------- SOM inputs ------------------->|
ID    Sepal Length   Sepal Width   Petal Length   Petal Width   Species
1     5.1            3.5           1.4            0.2           setosa
...   ...            ...           ...            ...           ...
51    7              3.2           4.7            1.4           versicolor
...   ...            ...           ...            ...           ...
150   5.9            3             5.1            1.8           virginica

Fig.2. Data for SOM



1.2. Classifying the data with SpiceSOM

Classifying the Iris flowers with a SOM of 8x10 neurons gives the output map and the corresponding output table shown in Fig. 3. Neuron (0, 1) and neuron (0, 3) each carry the labels of two kinds of data, versicolor and virginica. Every other neuron carries a single label. The SOM therefore confuses versicolor and virginica at neurons (0, 1) and (0, 3) and classifies correctly at the remaining neurons.
The map also shows that the setosa data is clearly separated on one side, while the versicolor and virginica data lie closer together. Intuitively, setosa flowers differ markedly in size and shape from versicolor and virginica. Versicolor and virginica are nearly the same size, and sometimes the two cannot be told apart from sepal and petal measurements alone.
Output table (columns X0 to X9; the labelled neurons of each row are listed from left to right):

Y0: virg | vers;virg | vers | vers;virg | vers | vers | vers | seto | seto
Y1: virg | vers | vers | vers | vers | seto | seto
Y2: virg | virg | vers | vers | vers | vers | vers | seto | seto
Y3: virg | virg | vers | vers | vers | vers | seto | seto
Y4: virg | virg | virg | virg | vers | vers | seto | seto | seto
Y5: virg | virg | virg | virg | vers | vers | seto | seto | seto
Y6: virg | virg | virg | vers | vers | vers | seto | seto | seto
Y7: virg | virg | virg | vers | vers | seto | seto | seto

Fig.3. Output Map of SOM, trained by Spice-SOM with Iris Data

Label: virg = virginica, vers = versicolor, seto = setosa
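
To make the idea of an output map concrete, here is a minimal NumPy sketch of SOM training and of building an output table that lists the labels landing on each neuron. It is not the Spice-SOM implementation: the update rule (Gaussian neighbourhood, linearly decaying learning rate and radius), the parameter values and the names `train_som`, `best_matching_unit` and `label_map` are all assumptions chosen for illustration, and the demo data is synthetic rather than the Iris data.

```python
import numpy as np

def best_matching_unit(weights, x):
    """Return (row, col) of the neuron whose weight vector is closest to x."""
    distances = np.linalg.norm(weights - x, axis=-1)
    return tuple(int(i) for i in np.unravel_index(np.argmin(distances), distances.shape))

def train_som(data, rows=8, cols=10, iterations=2000, lr0=0.5, seed=0):
    """Train a rows x cols SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every neuron, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    for t in range(iterations):
        fraction = t / iterations
        lr = lr0 * (1.0 - fraction)                    # learning rate decays to 0
        sigma = max(sigma0 * (1.0 - fraction), 0.5)    # neighbourhood radius shrinks
        x = data[rng.integers(len(data))]
        bmu = best_matching_unit(weights, x)
        grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        weights += lr * influence[..., None] * (x - weights)
    return weights

def label_map(weights, data, labels):
    """Output table: for each neuron, the set of labels of the samples mapped to it."""
    table = {}
    for x, label in zip(data, labels):
        table.setdefault(best_matching_unit(weights, x), set()).add(label)
    return table

# Synthetic demo: two well-separated clusters in 4 dimensions.
rng = np.random.default_rng(1)
cluster_a = 0.1 + 0.05 * rng.random((50, 4))
cluster_b = 0.9 - 0.05 * rng.random((50, 4))
demo_data = np.vstack([cluster_a, cluster_b])
demo_labels = ["a"] * 50 + ["b"] * 50
som = train_som(demo_data)
table = label_map(som, demo_data, demo_labels)
bmu_a = best_matching_unit(som, cluster_a.mean(axis=0))
bmu_b = best_matching_unit(som, cluster_b.mean(axis=0))
```

A neuron whose entry in `table` holds more than one label plays the same role as neurons (0, 1) and (0, 3) in Fig. 3: samples of different classes are mapped to the same place.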


1.3. Classifying the data with SpiceNeuro

Classifying the Iris flowers with an MLP neural network gives the output graphs shown in Fig. 4.
With the 1-output MLP, a threshold of 0.2 classifies the setosa data 100% correctly. With a threshold of 0.8, 1 versicolor dataset and 1 virginica dataset are misclassified.
With the 3-output MLP, a threshold of 0.5 classifies the setosa data 100% correctly, while 4 versicolor datasets and 3 virginica datasets are misclassified. In this case the 3-output MLP is less accurate than the 1-output MLP.

[Figure: two plots of the prediction output over the 150 samples. Left: "Iris Data Prediction by Spice-Neuro MLP Neural Network with 1 Output" (0 = setosa, 0.5 = versicolor, 1 = virginica; series Y0_from_NN). Right: "Iris Data Prediction by Spice-Neuro MLP Neural Network with 3 Outputs" (Output 1 = setosa, Output 2 = versicolor, Output 3 = virginica).]

Fig. 4. Output Graph of MLP with Iris Data
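
The thresholds mentioned for the two networks can be turned into a decision rule as follows. This is a hedged sketch: the text gives the threshold values (0.2 and 0.8 for one output, 0.5 for three outputs) but not the exact rule SpiceNeuro applies, so the way the thresholds are combined here, and the names `decode_single_output` and `decode_three_outputs`, are assumptions.

```python
import numpy as np

SPECIES = ["setosa", "versicolor", "virginica"]

def decode_single_output(y, low=0.2, high=0.8):
    """1-output network: targets are 0.0 (setosa), 0.5 (versicolor), 1.0 (virginica)."""
    if y < low:
        return "setosa"
    if y > high:
        return "virginica"
    return "versicolor"

def decode_three_outputs(outputs, threshold=0.5):
    """3-output network: pick the largest output, but only if it reaches the threshold."""
    outputs = np.asarray(outputs, dtype=float)
    best = int(np.argmax(outputs))
    if outputs[best] < threshold:
        return None  # no output is confident enough; count as unclassified
    return SPECIES[best]
```

For example, `decode_single_output(0.45)` gives "versicolor", and `decode_three_outputs([0.1, 0.3, 0.2])` gives `None` because no output reaches 0.5.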


2. Students' Score Data


Suppose we have the score sheet of the students in a class. Using the Spice-SOM self-organizing map, we want to group the students based on their scores.
2.1. Data preparation

So that Spice-SOM can read the data, we prepare it as in Table 3. Note that the name field includes the student's average score in parentheses (), which makes the output map easier to read. This data is available in the Data folder of the SpiceSOM program.

Table 3. Student score data for SOM

Columns: No, English, Algebra, Geometry, Analysis, Power System, Management Methodology, Geological System, Name

Example entries of the Name column: Pham Kieu Anh (6.0), Cung Hong Hien (7.0), ..., Vo Mai Manh (6.4), Pham La Trinh (6.0)

We train a SOM of size 6x10 and obtain the output map and output table shown in Figs. 5 and 6. On the output map it is easy to see that students with similar results are placed near each other, and that students with good results are placed far from students with poor results.

Fig 5. Output Map of SOM, trained by Spice-SOM with Students' score Data

Output table, columns X0 to X2 (cells of each row listed from left to right):

Y0: Ich_(8.7); Thieu_(8.8); Cong_(8.6); Dien_(8.8); Nhung_(8.5) | Ha_(8.4); Mai_(8.5); Thanh_(8.7) | Nhan_(8.5); Hien_(8.5); Minh_(8.7); Vinh_(8.5)
Y1: Hoa_(8.5); Linh_(8.6); Si_(8.7); Tung_(8.5) | Giang_(8.3)
Y2: Hung_(8.4); Hung_(8.5); Ngan_(8.4) | Duong_(8.3)
Y3: Quang_(8.3); Thao_(8.3); Trang_(8.3)
Y4: Hieu_(7.4)
Y5: Mai_(7.1); May_(7.1); Thu_(7.3)

Output table, columns X3 to X9 (one cell per line):

Linh_(8.2)
Anh_(7.3)
Nghia_(7.0); Han_(7.2); Xuan_(7.2)
Truong_(6.5)
Yen_(6.7); Cong_(6.8); Xuan_(6.9)
Bay_(6.9); Men_(6.7)
Phuong_(6.6); Quy_(6.8); Thanh_(6.5); Tuong_(6.8)
Hoa_(7.5)
Minh_(7.9)
Nhi_(7.5)
Trang_(7.1); Anh_(7.1); Hue_(7.1)
Cong_(6.3); Xuan_(6.3); Trinh_(6.0)
Hien_(7.0)
Kim_(7.0)
Han_(6.7); Hoang_(6.9)
Tranh_(6.9)
Loan_(6.6)
Lan_(7.2)
Dung_(6.9)
Linh_(6.3); Hung_(6.5)
Huong_(7.2)
May_(7.0); Ma_(6.9); Chi_(6.7); Han_(6.6)
Thao_(6.6); Hieu_(6.5); Lan_(6.5)
Dat_(6.1); Hoang_(6.2)
Thieu_(6.1)
Manh_(6.4)
Thu_(6.0); Hien_(6.0); Nghia_(6.0); Lieu_(6.0); Luan_(6.0)
Thanh_(6.3); Nhan_(6.2)
Anh_(6.0); Han_(5.9); La_(6.0); Thieu_(6.1); Thao_(6.0); Minh_(5.8)
Phuong_(6.3)
Nga_(5.7); La_(6.0)
My_(6.0); Nghia_(5.8); Hong_(5.7)
Minh_(6.0); Thanh_(6.0); Yen_(5.9)
Xuan_(6.0); Sang_(5.8); Nga_(6.1); Ly_(5.6)

Fig. 6. Output Table of SOM, trained by Spice-SOM with Students' score Data


3. Face/Nonface Classification
Readers who have studied image recognition and processing will know that the common method for face detection today is Haar-like features + the AdaBoost algorithm. SOM and MLP NN can also reach high accuracy, but recognition is slow. The example here uses an MLP NN and a SOM to classify frames that contain a face, with the aim of illustrating how to use the MLP NN, the SOM and the SOM output map.
3.1. Data preparation

The length of the data vectors depends on the feature you use. Suppose we have a set of sample images of size m x n. If pixel values are used as the feature (the vector representing the image), each image gives a vector of length m x n. If the pixel value histogram is used as the feature, each image gives a vector of length 256. With other features, such as Histogram of Oriented Gradient, the length of the vector depends on the parameters chosen when the feature is created. This document does not cover features for image recognition.
To classify images with an MLP, we need to encode the desired output; for example, we use 1 output where 1.0 means face and 0.0 means non-face. To classify images with a SOM, we only need the vector representing the image and a label, for example:

ID     feature vector   Output   label
1      ...              1.0      face
2      ...              1.0      face
...    ...              ...      ...
n      ...              0.0      nonface
n+1    ...              0.0      nonface

Fig.7. Data for MLP NN

ID     feature vector   label
1      ...              face
2      ...              face
...    ...              ...
n      ...              nonface
n+1    ...              nonface

Fig.8. Data for SOM
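
The two simplest features described in Section 3.1 (raw pixel values and the pixel value histogram) can be sketched as below. The sketch assumes an 8-bit grayscale image stored as a 2-D NumPy array; the function names `pixel_feature` and `histogram_feature` are illustrative and do not come from the Spice software.

```python
import numpy as np

def pixel_feature(image):
    """Flatten an m x n image into a vector of length m * n."""
    return np.asarray(image, dtype=float).ravel()

def histogram_feature(image):
    """Count how many pixels take each of the 256 gray levels (vector of length 256)."""
    pixels = np.asarray(image, dtype=np.uint8).ravel()
    return np.bincount(pixels, minlength=256).astype(float)

# Tiny 3 x 4 example image with gray levels 0..11.
image = np.arange(12, dtype=np.uint8).reshape(3, 4)
pixels = pixel_feature(image)      # length 3 * 4 = 12
hist = histogram_feature(image)    # length 256, entries sum to 12
```

The pixel vector grows with the image size, whereas the histogram always has 256 entries but discards where in the image each gray level occurs.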

3.2. Face/nonface image classification with Spice-SOM

Figs. 9 and 10 show SOM output maps from several training runs with different SOM sizes, trained on 400 sample face images downloaded from http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html and 700 sample non-face images taken at random from the internet, with 324 histogram of gradient inputs. In the figures, each neuron shows only the first image mapped to it (in practice several images may map to one neuron, and some neurons have no image mapped to them).

Fig. 9. Output Map of a SOM


Fig.10. Output Map of a SOM


3.3. Face/nonface image classification with Spice-MLP

Fig. 11 shows the error during training, the output values of the neural network (Actual Output from NN) and the desired output values (Desired Output), with 1100 datasets (400 faces + 700 nonfaces), 324 histogram of gradient inputs, 20 hidden neurons and 1 output neuron, the hyperbolic tangent activation function, 605/1100 datasets for training (55%) and 495/1100 datasets for testing (45%).

[Figure: two plots titled "Face Recognition by Spice-MLP". Left: Desired Output and Actual Output from NN over the 1100 datasets (324 histogram of gradient inputs, 20 hidden and 1 output neurons, Hyperbolic Tangent Activation Function, 605/1100 datasets for training, 495/1100 datasets for testing). Right: "Errors in 1000 Iterations", Training Error and Testing Error.]

Fig.11. Outputs and Training Errors of a MLP with face/non-face data

Fig. 12 shows the ROC (Receiver Operating Characteristic) curve and the accuracy at different thresholds on the testing data.
With a face/nonface threshold of 0.52, the MLP classifies 98.7% of this testing data correctly.

[Figure: two plots titled "Face Classification by SpiceMLP", Activation Function = HyperTanh. Left: "ROC after 40 Iterations", True Positive Rate against False Positive Rate. Right: "Accuracy with Different Threshold", Accuracy against Threshold, with the point Threshold = 0.52, Accuracy = 0.987 marked.]

Fig.12. ROC and Accuracy Lines by a MLP with face/nonface data
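
The ROC curve and the accuracy-versus-threshold curve are both obtained by sweeping a threshold over the network outputs on the testing data. The sketch below shows the computation under the assumptions that the desired output is 1.0 for the positive class and 0.0 for the negative class, and that both classes are present; the function names are illustrative, and the five sample values are made up to keep the arithmetic easy to follow.

```python
import numpy as np

def rates_at_threshold(desired, actual, threshold):
    """Return (true positive rate, false positive rate, accuracy) for one threshold."""
    is_positive = np.asarray(desired, dtype=float) >= 0.5
    predicted = np.asarray(actual, dtype=float) >= threshold
    tp = np.sum(predicted & is_positive)
    fp = np.sum(predicted & ~is_positive)
    tn = np.sum(~predicted & ~is_positive)
    fn = np.sum(~predicted & is_positive)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    accuracy = (tp + tn) / len(is_positive)
    return float(tpr), float(fpr), float(accuracy)

def sweep_thresholds(desired, actual, thresholds):
    """One (threshold, tpr, fpr, accuracy) tuple per threshold: the points of both curves."""
    return [(t,) + rates_at_threshold(desired, actual, t) for t in thresholds]

def best_threshold(desired, actual, thresholds):
    """Threshold with the highest accuracy (first one wins on ties)."""
    return max(sweep_thresholds(desired, actual, thresholds), key=lambda row: row[3])[0]

# Made-up example: 3 positives and 2 negatives.
desired = [1.0, 1.0, 1.0, 0.0, 0.0]
actual = [0.9, 0.8, 0.4, 0.6, 0.1]
tpr, fpr, accuracy = rates_at_threshold(desired, actual, 0.5)
```

Plotting `tpr` against `fpr` over all thresholds gives the ROC curve, and plotting `accuracy` against the threshold gives the second panel of the figure.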

4. Pedestrian Image Classification


4.1. Data preparation

The data used here consists of 924 pedestrian images downloaded from http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html and 1100 non-pedestrian images taken at random from the internet.
4.2. Pedestrian image classification with SpiceSOM

Figs. 13 and 14 show SOM output maps from several training runs with different SOM sizes, trained on 924 pedestrian images and 1100 non-pedestrian images, with 810 histogram of gradient inputs.

Fig.13. Output Map of a SOM with Pedestrian Data

Fig.14. Output Map of a SOM with Pedestrian Data

4.3. Pedestrian image classification with SpiceMLP

Fig. 15 shows the error during training, the output values of the neural network (Actual Output from NN) and the desired output values (Desired Output), with 2024 datasets (924 pedestrian + 1100 non-pedestrian), 810 histogram of gradient inputs, 5 hidden neurons and 1 output neuron, the hyperbolic tangent activation function, 1113/2024 datasets for training (55%) and 911/2024 datasets for testing (45%).
[Figure: two plots titled "Pedestrian Recognition by Spice-MLP". Left: Desired Output and Actual Output from NN over the 2024 datasets. Right: "Errors in 1000 Iterations", Training Error and Testing Error.]

Fig.15. Outputs and Training Errors of a MLP with Pedestrian Data

The panels of Fig. 16 show the ROC curve on the testing data when the MLP has just been initialized (untrained, 0 iterations) and after training for 1, 2, 3, 10 and 20 iterations. When the MLP has just been initialized, its ROC curve looks like that of random guessing. The network converges quite quickly, after only a few iterations.
[Figure: six ROC panels titled "Pedestrian Classification by SpiceMLP": ROC after MLP initialization (0 Iteration), ROC after 1 Iteration, ROC after 2 Iterations, ROC after 3 Iterations, ROC after 10 Iterations, ROC after 20 Iterations. Each panel plots True Positive Rate against False Positive Rate.]

Fig.16. ROC after several training iterations

Fig. 17 shows the ROC curve and the accuracy at different thresholds on the testing data after 30 iterations. With a pedestrian/non-pedestrian threshold of 0.75, the MLP classifies 98.2% of this testing data correctly.
[Figure: two plots titled "Pedestrian Classification by SpiceMLP", Activation Function = Sigmoid. Left: "ROC after 30 Iterations (detailed)", True Positive Rate against False Positive Rate. Right: "Accuracy with Different Threshold", Accuracy against Threshold, with the point Threshold = 0.75, Accuracy = 0.982 marked.]

Fig.17. ROC and Accuracy Lines by a MLP with pedestrian data

Fig.18.a. Some misclassified images, non-pedestrian -> pedestrian, with a pedestrian/non-pedestrian threshold of 0.5

Fig.18.b. Some misclassified images, pedestrian -> non-pedestrian, with a pedestrian/non-pedestrian threshold of 0.5

5. Car Image Classification
As with the face and pedestrian images, the following figures show SOM output maps from several training runs with different SOM sizes, trained on sample car images downloaded from http://cogcomp.cs.illinois.edu/Data/Car/

Fig.19.a. Output Map of a SOM with Car Data

Fig.19.b. Output Map of a SOM with Car Data

6. Stock Price Forecasting
One of the interesting applications of neural networks is stock price forecasting. Based on available market statistics, a neural network can forecast stock prices for the following days fairly accurately.
For an accurate forecast, the most important task is to find suitable market data, ranging from stock prices to macroeconomic and microeconomic information, and to encode that information sensibly so that the neural network can learn from it and generalize.
There are many ways to forecast stock prices with neural networks. This document presents the simplest one: using past prices to forecast future prices.
6.1. Data preparation

NASDAQ stock data can be downloaded from http://www.dailyfinance.com/historical-stock-prices/ For example, from 3_12_1990 to 1_12_2010 the data looks like Table 4:
Table 4. NASDAQ Stock Price

High        Low         Open        Close       Day_Month_Year
362.2798    359.0498    361.3198    361.3198    3_12_1990
364.2898    360.4199    364.1099    364.1099    4_12_1990
370.9700    364.0198    370.8699    370.8699    5_12_1990
378.0298    370.9199    372.2898    372.2898    6_12_1990
372.3499    369.0198    371.5398    371.5398    7_12_1990
...         ...         ...         ...         ...
2545.4100   2515.4800   2519.8900   2543.1200   24_11_2010
2541.4900   2522.4000   2525.9100   2534.5600   26_11_2010
2531.0300   2496.8300   2522.2400   2525.2200   29_11_2010
2510.7100   2488.6100   2497.1200   2498.2300   30_11_2010
2558.2900   2534.8100   2535.1900   2549.4300   1_12_2010

From this data you want to forecast the market price for the next day. For example, you want to predict the next day's average price (Open + Close)/2.0 from the average prices of the preceding days.
The data you have is a time series; for an MLP to learn from it, it must be rearranged as follows.

From the Open and Close columns, create a new column holding the average of the two.

Suppose you want to use the average prices of the past 15 days to predict the price of the next day on the NASDAQ market. Create rows of 16 values each: the first 15 values of a row are the prices of 15 consecutive days, and the 16th (last) value is the price of the day following the 15th. You will therefore use a neural network with 15 inputs and 1 output. The data looks like Table 5:
Table 5. NASDAQ stock prices, rearranged so that the MLP NN can learn from them

Today-14   Today-13   Today-12   ...   Today-2    Today-1    Today      Tomorrow   Label
361.3198   364.1099   370.8699   ...   371.22     372.2998   373.5999   372.4099   21_12_1990
364.1099   370.8699   372.2898   ...   372.2998   373.5999   372.4099   372.3999   24_12_1990
370.8699   372.2898   371.5398   ...   373.5999   372.4099   372.3999   371.0498   26_12_1990
372.2898   371.5398   371.47     ...   372.4099   372.3999   371.0498   371.2      27_12_1990
...        ...        ...        ...   ...        ...        ...        ...        ...
2573.305   2578.305   2575.455   ...   2520.705   2499.58    2531.505   2530.235   24_11_2010
2578.305   2575.455   2575.03    ...   2499.58    2531.505   2530.235   2523.73    26_11_2010
2575.455   2575.03    2571.545   ...   2531.505   2530.235   2523.73    2497.675   29_11_2010
2575.03    2571.545   2544.88    ...   2530.235   2523.73    2497.675   2542.31    30_11_2010
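
The rearrangement of a price series into rows of 15 inputs and 1 output can be sketched as a sliding window. This is an illustrative sketch rather than the tool's own preprocessing; the names `average_price` and `make_windows` are assumptions, and the demo series is a simple counter so the window positions are easy to check.

```python
import numpy as np

def average_price(open_prices, close_prices):
    """Daily average price (Open + Close) / 2.0."""
    return (np.asarray(open_prices, dtype=float) + np.asarray(close_prices, dtype=float)) / 2.0

def make_windows(series, n_inputs=15, n_outputs=1):
    """Slide a window over the series.

    Each row of X holds n_inputs consecutive values (Today-14 ... Today for
    n_inputs=15); the matching row of Y holds the next n_outputs values.
    """
    series = np.asarray(series, dtype=float)
    n_rows = len(series) - n_inputs - n_outputs + 1
    X = np.array([series[i:i + n_inputs] for i in range(n_rows)])
    Y = np.array([series[i + n_inputs:i + n_inputs + n_outputs] for i in range(n_rows)])
    return X, Y

# Demo on a counter series 0, 1, ..., 19: 20 days give 5 rows of 15 inputs + 1 output.
demo_series = np.arange(20.0)
X, Y = make_windows(demo_series, n_inputs=15, n_outputs=1)
```

The same function covers other layouts: `make_windows(series, n_inputs=20, n_outputs=3)` builds rows with 20 past days as inputs and the next 3 days as outputs.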

Similarly, if you want to use the prices of the past 20 days to predict the prices of the next 3 days, prepare the data in the same way and use a neural network with 20 inputs and 3 outputs, and so on.
The following example uses the average prices of the past 15 days to predict the average price of the next day, that is, 15 inputs and 1 output, from 21_12_1990 to 30_11_2010. The SpiceMLP program is used for the prediction. The number of datasets is 5026. Here 80% of the data (4021 datasets) is used for training and 20% (1005 datasets) for testing. The number of training iterations is 1000.

Number of trained data: 4021
Number of tested data: 1005
Taken iterations: 1000
Number of Inputs: 15
Number of Outputs: 1

This data is in the Data folder created when you install the SpiceMLP program (also known as Spice Neuro).
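
The split into training and testing data can be sketched as follows. The text does not say how the 80% of the stock data is selected, so the random selection used here is an assumption (the exchange rate example later in the document does state that its split is random); `split_train_test` is an illustrative name.

```python
import numpy as np

def split_train_test(X, Y, train_fraction=0.8, seed=0):
    """Randomly assign a fraction of the rows to training and the rest to testing."""
    n_rows = len(X)
    n_train = int(round(n_rows * train_fraction))
    order = np.random.default_rng(seed).permutation(n_rows)
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X[train_idx], Y[train_idx], X[test_idx], Y[test_idx]

# Placeholder arrays with the same shape as the stock data: 5026 rows, 15 inputs, 1 output.
X_all = np.zeros((5026, 15))
Y_all = np.zeros((5026, 1))
X_train, Y_train, X_test, Y_test = split_train_test(X_all, Y_all, train_fraction=0.8)
```

With 5026 rows and a fraction of 0.8 this gives 4021 training rows and 1005 testing rows, matching the counts quoted above.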


6.2. Finding suitable parameters for the neural network

For the neural network to forecast well, suitable parameters must be chosen. Suitable parameters usually depend heavily on your data: a setting that works well for one dataset may work poorly for another. Here we introduce the simplest method: with the same training and testing data, change one parameter at a time to find a relatively good value. Note that the input and output data must be normalized before the network is trained. The example here uses the Linear function for normalization. First, we look for the number of hidden neurons that seems most reasonable.
Table 6. Errors when the HyperTanh activation function is used for both the hidden and output layers and the number of hidden neurons is varied.

Number of Hidden Neurons   Training Error   Testing Error
15                         9.01E-05         9.35E-05
12                         9.22E-05         1.12E-04
10                         8.62E-05         1.04E-04
8                          9.01E-05         9.36E-05
6                          6.54E-05         6.94E-05
4                          6.49E-05         6.65E-05
3                          4.43E-05         4.25E-05
2                          4.28E-05         4.01E-05

As you can see, 2 hidden neurons seems best. Next, we look for the activation function of the output layer.

Table 7. Errors when the HyperTanh activation function is used for the hidden layer with 2 hidden neurons and the activation function of the output layer is varied.

Activation Function             Errors
Hidden Layer    Output Layer    Training    Testing
HyperTanh       Identity        3.94E-05    3.38E-05
HyperTanh       Sigmoid         7.35E-05    1.08E-04
HyperTanh       ArcTan          6.97E-05    6.41E-05
HyperTanh       ArcSinh         6.24E-05    6.47E-05
HyperTanh       Sin             5.05E-05    4.83E-05
HyperTanh       Gaussian        5.48E-05    5.77E-05
HyperTanh       XSinX           5.54E-05    5.53E-05

As you can see, Identity seems best for the output layer. Finally, we look for the activation function of the hidden layer.


Table 8. Errors when the Identity activation function is used for the output layer with 2 hidden neurons and the activation function of the hidden layer is varied.

Activation Function             Errors
Hidden Layer    Output Layer    Training    Testing
Sigmoid         Identity        4.33E-05    3.93E-05
HyperTanh       Identity        3.94E-05    3.38E-05
ArcTan          Identity        4.03E-05    4.37E-05
ArcSinh         Identity        4.05E-05    3.99E-05
Sin             Identity        4.07E-05    4.18E-05
Gaussian        Identity        4.43E-05    4.25E-05
XSinX           Identity        4.72E-05    4.98E-05

As you can see, HyperTanh seems best for the hidden layer.

We therefore choose 2 hidden neurons, Identity as the activation function of the output layer, and HyperTanh as the activation function of the hidden layer.
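
The procedure followed in this section (fix everything, vary one parameter, keep the best value, move on to the next parameter) can be written as a small generic routine. This is a sketch of the search strategy only: `one_at_a_time_search` is an illustrative name, and `toy_error` is a made-up stand-in for "train the network with these settings and return its testing error".

```python
def one_at_a_time_search(evaluate, candidates, start):
    """Vary one parameter at a time, keeping the best value found so far.

    evaluate   -- function mapping a settings dict to an error (lower is better)
    candidates -- dict: parameter name -> list of values to try, in search order
    start      -- dict of initial settings
    """
    best = dict(start)
    best_error = evaluate(best)
    for name, values in candidates.items():
        for value in values:
            trial = dict(best)
            trial[name] = value
            error = evaluate(trial)
            if error < best_error:
                best, best_error = trial, error
    return best, best_error

def toy_error(settings):
    """Hypothetical error surface, for demonstration only (not measured data)."""
    penalty = abs(settings["hidden_neurons"] - 2)
    penalty += 0 if settings["output_fn"] == "Identity" else 1
    penalty += 0 if settings["hidden_fn"] == "HyperTanh" else 1
    return penalty

candidates = {
    "hidden_neurons": [15, 12, 10, 8, 6, 4, 3, 2],
    "output_fn": ["Identity", "Sigmoid", "ArcTan", "ArcSinh", "Sin", "Gaussian", "XSinX"],
    "hidden_fn": ["Sigmoid", "HyperTanh", "ArcTan", "ArcSinh", "Sin", "Gaussian", "XSinX"],
}
start = {"hidden_neurons": 15, "hidden_fn": "HyperTanh", "output_fn": "HyperTanh"}
best, best_error = one_at_a_time_search(toy_error, candidates, start)
```

Searching one parameter at a time needs far fewer training runs than trying every combination, at the cost of possibly missing combinations that only work well together, which is why the text speaks of a relatively good value rather than an optimum.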
6.3. Training the network

Train the network several times and choose the run with the smallest training error and testing error. The training information and the error graph will look as follows.
Information for the last training run
Activation function for the hidden layer: HyperTanh
Activation function for the output layer: Identity
Final learning rate: 0.03308719
MSE of the training data: 4.138118E-05
MSE of the testing data: 3.423549E-05
Number of training data: 4021
Number of testing data: 1005
Number of iterations: 1000

[Figure: "Error in Training", Training Error and Testing Error over 1000 iterations; vertical axis from 0 to 0.0004.]

Fig. 20. Training Error when learning NASDAQ Stock prices
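
For readers who want to see what training such a network involves, here is a minimal NumPy sketch of an MLP with a HyperTanh (tanh) hidden layer and an Identity output layer, trained to minimize the MSE. It is not the SpiceMLP algorithm: plain full-batch gradient descent with a fixed learning rate, the initialization, and the rule for picking the best run (smallest sum of training and testing MSE) are all assumptions made for illustration.

```python
import numpy as np

def forward(params, X):
    """Return (hidden activations, network outputs) for inputs X."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(X @ W1 + b1)      # HyperTanh hidden layer
    return hidden, hidden @ W2 + b2    # Identity output layer

def mse(params, X, Y):
    return float(np.mean((forward(params, X)[1] - Y) ** 2))

def train_mlp(X, Y, n_hidden=2, iterations=1000, learning_rate=0.1, seed=0):
    """Full-batch gradient descent on the MSE; returns (params, MSE history)."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 0.5, (X.shape[1], n_hidden)), np.zeros(n_hidden),
              rng.normal(0.0, 0.5, (n_hidden, Y.shape[1])), np.zeros(Y.shape[1])]
    history = []
    for _ in range(iterations):
        hidden, output = forward(params, X)
        diff = output - Y
        history.append(float(np.mean(diff ** 2)))
        grad_output = 2.0 * diff / diff.size                       # d(MSE)/d(output)
        grad_hidden = (grad_output @ params[2].T) * (1.0 - hidden ** 2)
        grads = [X.T @ grad_hidden, grad_hidden.sum(axis=0),
                 hidden.T @ grad_output, grad_output.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= learning_rate * g
    return params, history

def best_of_runs(X_train, Y_train, X_test, Y_test, seeds=(0, 1, 2)):
    """Train once per seed and keep the run with the smallest train + test MSE."""
    runs = []
    for seed in seeds:
        params, _ = train_mlp(X_train, Y_train, seed=seed)
        score = mse(params, X_train, Y_train) + mse(params, X_test, Y_test)
        runs.append((score, seed, params))
    return min(runs, key=lambda run: run[0])
```

`train_mlp` expects inputs and outputs that are already normalized, with X of shape (rows, inputs) and Y of shape (rows, outputs); the returned `history` is the kind of curve plotted as the training error.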

6.4. Checking the modeled data

After training finishes, inspect the training data in the View Data section. The outputs produced by the MLP for the training data (NN Outputs) look like Fig. 21:

[Figure: "NASDAQ Stock Data Modeling by SpiceMLP", Modeling of Training Data, Number of trained data: 4021. Desired Outputs and NN Outputs (normalized) from 21 Dec 1990 to 30 Nov 2010.]

Fig. 21. Outputs of Training Data (NASDAQ Stock prices)

For the training data, the network output almost coincides with the desired output (that is, the actual output of the training data).
For the testing data, the graph looks like Fig. 22:

[Figure: "NASDAQ Stock Data Modeling by SpiceMLP", Modeling of Testing Data, Number of tested data: 1005. Desired Outputs and NN Outputs (normalized) from 3 Jan 1991 to 30 Nov 2010, with 8 March 2000 marked.]

Fig. 22.a. Outputs of Testing Data (NASDAQ Stock prices)

Looking at a shorter period gives the graph in Fig. 22.b.

For the testing data, the network output also approximates the desired output (that is, the actual output of the testing data). It is easy to see that the MLP has learned quite well, although small errors remain at some points.
Questions for the reader
How can the error of the MLP forecast be reduced?
Can other data also be used to forecast stock prices?
Can the method above be applied to other time series problems?
How should the data be prepared for a problem that is not a time series?
The example above uses the average prices of the past 15 days to predict the average price of the next day. How can the average prices of the past 30 days be used to predict the 3 average prices of the next 3 days?
With data up to today, how can tomorrow's price be predicted?

[Figure: "NASDAQ Stock Data Modeling by SpiceMLP", Modeling of Testing Data. Desired Outputs and NN Outputs (normalized) from 7 Oct 1998 to 14 Dec 2000, with 8 March 2000 marked.]

Fig. 22.b. Outputs of Testing Data (NASDAQ Stock prices)

7. Exchange Rate Forecasting
Suppose you want to forecast the Canada Dollar/US Dollar and Canada Dollar/Japanese Yen exchange rates. The data can be downloaded from
http://www.bankofcanada.ca/rates/exchange/10-year-lookup/
The Canada Dollar/US Dollar and Canada Dollar/Japanese Yen rates downloaded from this address look as follows
USD->CAD       CAD->USD       Date          JPY->CAD       CAD->JPY       Date
1.5657         0.6387         2001/10/12    0.012948       77.232005      2001/10/12
1.5579         0.6419         2001/10/15    0.012883       77.621672      2001/10/15
1.5619         0.6402         2001/10/16    0.01288        77.639752      2001/10/16
...            ...            ...           ...            ...            ...
1.5981         0.6257         2001/11/8     0.01331        75.13148       2001/11/8
1.6021         0.6242         2001/11/9     0.013325       75.046904      2001/11/9
Bank holiday   Bank holiday   2001/11/12    Bank holiday   Bank holiday   2001/11/12
1.5981         0.6257         2001/11/13    0.013158       75.999392      2001/11/13
1.5916         0.6283         2001/11/14    0.013082       76.440911      2001/11/14
1.5868         0.6302         2001/11/15    0.012961       77.154541      2001/11/15

Suppose you use the two columns CAD->USD and CAD->JPY, and use the data of the past 30 days to forecast the rates 5 days ahead. That is, if the current day is Today, you use the data of Today-29, Today-28, ..., Today-1, Today to forecast the rates of Today+5. You then have 60 inputs (30 for CAD->USD and 30 for CAD->JPY) and 2 outputs, one for CAD->USD and one for CAD->JPY. The following example uses the data of the past 15 days to forecast the rates 5 days ahead for both CAD->USD and CAD->JPY, so there are 30 inputs and 2 outputs.
With the data above, remove the "Bank holiday" rows and then prepare the data as follows


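One dataset row (columns: ID, 15 CAD->USD inputs, 15 CAD->JPY inputs, 2 outputs, LABEL):

Inputs, CAD->USD:     Today-14 = 0.6387, Today-13 = 0.6419, ..., Today-1 = 0.6302, Today = 0.6285
Inputs, CAD->JPY:     Today-14 = 77.232005, Today-13 = 77.621672, ..., Today-1 = 77.23797, Today = 76.581406
Outputs:              CAD->USD Today+5 = 0.6257, CAD->JPY Today+5 = 75.13148
LABEL (Today Date):   2001/11/1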
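
The preparation just described (drop the "Bank holiday" rows, then take 15 past values of each rate as inputs and the values 5 days ahead as outputs) can be sketched as follows. The function names `drop_holidays` and `make_horizon_windows` are illustrative assumptions, the input ordering (all CAD->USD lags first, then all CAD->JPY lags) follows the layout shown above, and the demo series are counters rather than real rates.

```python
import numpy as np

def drop_holidays(dates, usd, jpy, marker="Bank holiday"):
    """Remove the rows where either rate is the holiday marker."""
    kept = [(d, float(u), float(j)) for d, u, j in zip(dates, usd, jpy)
            if u != marker and j != marker]
    return ([d for d, _, _ in kept],
            np.array([u for _, u, _ in kept]),
            np.array([j for _, _, j in kept]))

def make_horizon_windows(series_list, n_lags=15, horizon=5):
    """Inputs: the last n_lags values of every series up to Today.
    Outputs: the value of every series at Today + horizon."""
    series = [np.asarray(s, dtype=float) for s in series_list]
    n_days = len(series[0])
    X, Y = [], []
    for today in range(n_lags - 1, n_days - horizon):
        X.append(np.concatenate([s[today - n_lags + 1:today + 1] for s in series]))
        Y.append([s[today + horizon] for s in series])
    return np.array(X), np.array(Y)

# Rows taken from the table above, including the bank holiday.
dates = ["2001/11/9", "2001/11/12", "2001/11/13"]
usd_raw = [0.6242, "Bank holiday", 0.6257]
jpy_raw = [75.046904, "Bank holiday", 75.999392]
clean_dates, usd, jpy = drop_holidays(dates, usd_raw, jpy_raw)

# Counter series make the window positions easy to verify.
X, Y = make_horizon_windows([np.arange(25.0), 100.0 + np.arange(25.0)])
```

With two series, 15 lags and a horizon of 5, each row of `X` has 30 inputs and each row of `Y` has 2 outputs, as in the example.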

Note that CAD->USD takes values in [0.6199, 1.0905] and CAD->JPY takes values in [68.918, 123.885]. Because the two ranges differ so much, normalizing all of the data in a single pass would leave CAD->USD with small values and CAD->JPY with large values, and the MLP would not learn well.
For the MLP to learn well, normalize CAD->USD and CAD->JPY separately. The Data folder of the Spice-MLP program contains the unnormalized data (CAD_USD_JPN_2489_data_30inputs_2outputs.csv) and the data normalized with the Linear method (CAD_USD_JPN_Normalized_2489_data_30inputs_2outputs.csv), with 30 inputs, 2 outputs and 2489 datasets, from 2001/11/1 to 2011/10/3.
We randomly choose 70% of the data (1742 datasets) as training data and 30% of the data (747 datasets) as testing data. Training the MLP gives the following information:
Activated Function for Hidden Layer: HyperTanh
Activated Function for Output Layer: Linear
Final Learning rate: 0.003811921
Final MSE of Training Set: 0.0004247656
Final MSE of Testing Set: 0.0004671731
Number of trained data: 1742
Number of tested data: 747
Taken iterations: 1000
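
The effect of normalizing the two rates separately rather than together can be checked numerically. The sketch below assumes that the Linear method is a min-max mapping onto [0, 1]; the exact target range used by Spice-MLP is not stated in this document, and `linear_normalize` is an illustrative name. The value ranges are the ones quoted in this section.

```python
import numpy as np

def linear_normalize(values, low=None, high=None):
    """Map values linearly so that low -> 0.0 and high -> 1.0 (min-max scaling)."""
    values = np.asarray(values, dtype=float)
    low = values.min() if low is None else low
    high = values.max() if high is None else high
    return (values - low) / (high - low)

USD_RANGE = (0.6199, 1.0905)    # CAD->USD range quoted in the text
JPY_RANGE = (68.918, 123.885)   # CAD->JPY range quoted in the text

# Separate normalization: each rate uses its own range and fills [0, 1].
usd_separate = linear_normalize(USD_RANGE, *USD_RANGE)
jpy_separate = linear_normalize(JPY_RANGE, *JPY_RANGE)

# Joint normalization: one range for both rates.
joint_low, joint_high = USD_RANGE[0], JPY_RANGE[1]
usd_joint = linear_normalize(USD_RANGE, joint_low, joint_high)
jpy_joint = linear_normalize(JPY_RANGE, joint_low, joint_high)
```

Under joint normalization every CAD->USD value falls between 0 and about 0.004, so its day-to-day changes are almost invisible next to CAD->JPY, which occupies roughly 0.55 to 1.0.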

Saving the data learned by the MLP to a csv file and plotting it gives the network output graphs shown in Fig. 23.

[Figure: "Currency Exchange Modeling by Spice-MLP, Training Data". Y0: CAD -> USD, Y1: CAD -> JPN; series Y0_Desired, Y0_from_NN, Y1_Desired, Y1_from_NN (normalized) over the training datasets.]

Fig. 23.a. Outputs of Training Data (CAD->USD and CAD->JPY)

[Figure: "Currency Exchange Modeling by Spice-MLP, Testing Data". Y0: CAD -> USD, Y1: CAD -> JPN; series Y0_Desired, Y0_from_NN, Y1_Desired, Y1_from_NN (normalized) over the testing datasets.]

Fig. 23.b. Outputs of Testing Data (CAD->USD and CAD->JPY)

The graphs show that, overall, the MLP learns both the CAD->USD and the CAD->JPY data well, although at some points the network output deviates slightly from the desired output.
Questions for the reader:
Adjust the data and the network parameters so that the MLP learns with the accuracy you want.
The example above uses the data of the past 15 days to forecast the rates 5 days ahead. How would you use the data of the past 60 days to forecast the rates 10 days and 15 days ahead?

8. Forecasting the Water Flow of Hoa Binh Lake

From past data on the water flow into a hydropower reservoir, we can forecast the flow into the reservoir in the near future. The Data folder of the Spice-MLP program contains the data file Hoabinh_water_level_3_input_1_output.csv. This file is for forecasting the water flow of Hoa Binh Lake 10 days ahead, Q(t+10), from the flows at the present time and in the past. The data has 3 inputs: the current flow Q(t), the flow 10 days earlier Q(t-10) and the flow 20 days earlier Q(t-20). There are 570 datasets, of which 480 are for training (line 2 to line 481) and 90 are for testing (line 482 to line 571). The data was provided by Pham Thi Hoang Nhung of the Water Resources University. Readers may refer to Pham Thi Hoang Nhung's Master's thesis (1997), "a survey of some advanced machine learning methods, combining neural network learning with the genetic algorithm, and its application to forecasting the water flow into Hoa Binh Lake". Many thanks to Pham Thi Hoang Nhung for permission to use the Hoa Binh Lake water flow data as an illustration in this document.
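
The layout of this dataset, three lagged inputs and one output 10 days ahead, can be sketched as below. The sketch assumes a flow series `q` with one value per day and an input order of Q(t-20), Q(t-10), Q(t); neither the sampling interval nor the column order of the csv file is stated in this document, and `make_lagged_rows` is an illustrative name. The demo series is a counter, not real flow data.

```python
import numpy as np

def make_lagged_rows(q, lags=(20, 10, 0), lead=10):
    """Build inputs [Q(t - lag) for each lag] and the output Q(t + lead)."""
    q = np.asarray(q, dtype=float)
    X, Y = [], []
    for t in range(max(lags), len(q) - lead):
        X.append([q[t - lag] for lag in lags])
        Y.append(q[t + lead])
    return np.array(X), np.array(Y)

# Counter series 0, 1, ..., 39: t runs from 20 to 29, giving 10 rows.
X, Y = make_lagged_rows(np.arange(40.0))

# Sequential split in the style described above: first rows for training, the rest for testing.
n_train = 8
X_train, Y_train = X[:n_train], Y[:n_train]
X_test, Y_test = X[n_train:], Y[n_train:]
```

For the real file the same sequential split would use the first 480 rows for training and the remaining 90 for testing, which keeps the testing data strictly later in time than the training data.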
Brief information about the Hoa Binh hydropower plant from the wiki: the Hoa Binh Hydropower Plant was built at Hoa Binh Lake, Hoa Binh province, on the Da River in northern Vietnam. At the time of writing it is the largest hydropower project in Vietnam and Southeast Asia. The plant was built and operated with assistance from the Soviet Union. Construction started on 6 November 1979 and the plant was inaugurated on 20 December 1994. The design capacity is 1,920 megawatts, from 8 generating units of 240,000 kilowatts each. Annual electricity output is 8.16 billion kilowatt hours (kWh).
A picture of the hydropower reservoir from the internet:

With the data file Hoabinh_water_level_3_input_1_output.csv, load the data as in Fig.24, normalize it with the Linear method, and split it into training and testing data as in Fig.25.

Fig. 24. Load Data with Hoabinh_water_level_3_input_1_output.csv

Fig. 25. Split Data for training and testing

[Figure: "Discharge Hydrograph Of Hoabinh Lake", Modeling by Spice-MLP, Training Data. Activation Functions: Sigmoid; Taken iterations: 1000. Series Y0_Desired and Y0_from_NN (normalized) over the 480 training datasets.]

Fig. 26. Discharge Hydrograph Of Hoabinh Lake, Training Data

[Figure: "Discharge Hydrograph Of Hoabinh Lake", Modeling by Spice-MLP, Testing Data. Activation Functions: Sigmoid; Taken iterations: 1000. Series Y0_Desired and Y0_from_NN (normalized) over the 90 testing datasets.]

Fig. 27. Discharge Hydrograph Of Hoabinh Lake, Testing Data

Choose the activation functions for the hidden and output layers, train the network, and after a number of iterations save the data to a csv file. You will get the network output graph for the training data as in Fig.26, for the testing data as in Fig.27, and the training information as in Fig.28.
The network learns quite well, but at some points the network output deviates considerably from the desired output. How can this deviation be reduced? This is left for the reader to investigate. Good luck.


SPICE-NEURO by Cao Thang 2004-2011


Last trained information;
Activated Function for Hidden Layer: Sigmoid;
Activated Function for Output Layer: Sigmoid;
Final Learning rate: 0.03572268;
Final MSE of Training Set: 0.003680485;
Final MSE of Testing Set: 0.003794955;
Number of trained data: 480;
Number of tested data: 90;
Taken iterations: 1000;

Fig. 28. Discharge Hydrograph Of Hoabinh Lake modeling by Spice-MLP, Training Information

9. Conclusion
This document has shown how to use neural networks in practical applications. The author hopes it is useful to you.
Thank you for reading. Good luck in your studies and work, and enjoy life.
