August, 12th
1 Classification:
   1 Predict whether a house is expensive or not
   2 Brain state decoding: lie detector
   3 Automatic disease diagnosis: early detection of Alzheimer's disease
2 Regression:
   1 Predict the house price
   2 Predict cochlear implantation (CI) outcomes
   3 Predict the reading gains in children with dyslexia
where $x^{(i)}$ is called the feature vector and the superscript $(i)$ indicates the
i-th training sample. $x^{(i)}$ is an m-dimensional vector:
$$x^{(i)} = [x_1^{(i)}, \dots, x_j^{(i)}, \dots, x_m^{(i)}] \quad (2)$$
$x_j^{(i)}$ is the j-th feature (independent variable) for the i-th subject. $y^{(i)}$
is called the label (dependent variable).
3 Given the training set, an SVM can learn the rules to predict $y$ from $x$.
1 Download: http://www.csie.ntu.edu.tw/~cjlin/libsvm/oldfiles/index-1.0.html
2 Documents: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
3 Unzip
4 Move the unzipped folder to where you want to install libsvm, e.g.
C:\users\MATLAB\tools
5 Add the libsvm folder to the MATLAB path
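Step 5 can be done from the MATLAB prompt. The lines below are a minimal sketch, assuming the archive was unzipped into a folder named libsvm inside the example directory above and that the MATLAB interface sits in its matlab subfolder. Both folder names are assumptions, so adjust them to your installation.

```matlab
% Hypothetical install location; change it to wherever libsvm was unzipped.
libsvmDir = fullfile('C:\users\MATLAB\tools', 'libsvm', 'matlab');

if exist(libsvmDir, 'dir') == 7
    addpath(libsvmDir);      % make svmtrain/svmpredict visible in this session
    % savepath;              % optional: keep the path across sessions
end

% Show which svmtrain MATLAB will call; it should point into libsvmDir.
which svmtrain
```

Some MATLAB releases ship a toolbox function that is also named svmtrain, so it is worth checking the output of `which` before training.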
How to train an SVM model? How to use the trained model to predict new
samples whose true labels are not known?
clc
clear
load data.mat;   % assumed to provide data, label, and test
% Min-max scale each feature to [0, 1] using the range of the training data
m=min(data);
n=max(data);
data=(data-repmat(m,size(data,1),1))./repmat(n-m,size(data,1),1);
test=(test-m)./(n-m);
% Plot the two classes and the test sample
plot(data(1:10,1),data(1:10,2),'o','MarkerSize',12);
hold on
plot(data(11:21,1),data(11:21,2),'r*','MarkerSize',12);
plot(test(1),test(2),'k*','MarkerSize',12);
% Train a C-SVC (-s 0) with a linear kernel (-t 0) and C = 10
model = svmtrain(label, data, '-s 0 -t 0 -c 10');
[predicted_label, accuracy, decision_score] = svmpredict(label, data, model);
% Decision boundary: w(1)*x + w(2)*y - rho = 0
w = model.SVs' * model.sv_coef;
x0=[0,1];
y0=zeros(1,2);
y0(1)=(model.rho-x0(1)*w(1))/w(2);
y0(2)=(model.rho-x0(2)*w(1))/w(2);
plot(x0,y0,'k');
axis([0,1,0,1.2])
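For the second question, svmpredict always takes a label argument, even when the true label is unknown. A common workaround is to pass placeholder labels and ignore the reported accuracy. The sketch below uses made-up toy data and only calls libsvm if its MEX file is found on the path.

```matlab
% Toy training set: two well-separated classes in 2-D (made-up numbers).
data  = [0.1 0.2; 0.2 0.1; 0.8 0.9; 0.9 0.8];
label = [0; 0; 1; 1];
newSample = [0.85 0.75];                 % sample whose true label is unknown

% Placeholder labels: they are only used to compute the reported accuracy.
dummyLabel = zeros(size(newSample, 1), 1);

if exist('svmtrain', 'file') == 3        % run only if the libsvm MEX file is on the path
    model = svmtrain(label, data, '-s 0 -t 0 -c 10');
    [predicted_label, accuracy, decision_score] = svmpredict(dummyLabel, newSample, model);
    disp(predicted_label);               % accuracy is meaningless for placeholder labels
end
```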
C-SVC and nu-SVC are essentially the same model with different
parameterizations (C versus nu). You can choose either one.
Black boundary: -s 0 -t 0 -c 10
Green boundary: -s 1 -t 0 -n 0.7
Choose the linear kernel when the number of features is large and
the number of training samples is small. Another advantage of the linear
kernel is that the model is easy to interpret.
Lirong Tan (CCHMC) PNRC Neuroimaging Training Course 2013-8-12 10 / 38
parameter -t
Radial basis function kernel (RBF):
1 Another parameter needs to be set, $\gamma$. If $\gamma$ is too small, the model
may be underfitted. If $\gamma$ is too large, it may be overfitted.
2 Choose the RBF kernel when the number of features is small and the
number of training samples is large.
3 Make sure you have performed feature scaling before you use the RBF
kernel. Otherwise, $\exp(-\gamma \|u - v\|^2)$ will be dominated by features
with a large range.
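The effect of feature scaling on the RBF kernel can be checked numerically. The sketch below uses two made-up samples, an illustrative gamma of 0.5, and assumed feature ranges. It only evaluates the kernel formula and does not call libsvm.

```matlab
gamma = 0.5;                     % illustrative value, not a recommendation
rbf = @(u, v, g) exp(-g * sum((u - v).^2));

% Two samples: feature 1 lies in [0, 1], feature 2 is in the thousands.
u = [0.2 1000];
v = [0.9 1010];

kRaw = rbf(u, v, gamma);         % the distance is dominated by feature 2

% Min-max scale each feature with assumed ranges [0, 1] and [0, 2000].
lo = [0 0];
hi = [1 2000];
us = (u - lo) ./ (hi - lo);
vs = (v - lo) ./ (hi - lo);
kScaled = rbf(us, vs, gamma);    % both features now contribute

fprintf('raw: %g, scaled: %g\n', kRaw, kScaled);
```

Without scaling the kernel value is practically zero, so the two samples look unrelated no matter how close they are on the first feature.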
Other suggestions:
1 Recent research shows that if RBF is used with model selection, then
there is no need to consider the linear kernel.
2 People tend not to use the polynomial and sigmoid kernels that much.
3 In libsvm, you can define your own kernel function and feed the
precomputed kernel matrix to the SVM.
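A precomputed kernel is passed to libsvm with -t 4, and the kernel matrix needs a leading column of sample serial numbers. The sketch below uses made-up toy data and a plain linear kernel as a stand-in for a user-defined kernel. The variable names are illustrative, and libsvm is only called if its MEX file is found on the path.

```matlab
% Toy data (made-up numbers): 4 training samples, 2 test samples, 2 features.
train      = [0.1 0.2; 0.2 0.1; 0.8 0.9; 0.9 0.8];
trainLabel = [0; 0; 1; 1];
test       = [0.15 0.15; 0.85 0.85];
testLabel  = [0; 1];

% User-defined kernel; a linear kernel is used here as a stand-in.
myKernel = @(A, B) A * B';

% libsvm expects a leading column of sample serial numbers.
Ktrain = [(1:size(train, 1))', myKernel(train, train)];   % 4 x 5
Ktest  = [(1:size(test, 1))',  myKernel(test,  train)];   % 2 x 5

if exist('svmtrain', 'file') == 3      % run only if the libsvm MEX file is on the path
    model = svmtrain(trainLabel, Ktrain, '-t 4');          % -t 4: precomputed kernel
    [predicted_label, accuracy, decision_score] = svmpredict(testLabel, Ktest, model);
end
```

Note that the test kernel is computed between the test samples and the training samples, so its width matches the training kernel.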
1 Objective function:
$$C \sum_{i=1}^{N} \left[ y_i \, \mathrm{cost}_1 + (1 - y_i) \, \mathrm{cost}_0 \right] + \sum_{j=1}^{M} w_j^2 \quad (4)$$
2 The first term penalizes the samples that have been classified wrongly.
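Equation (4) can be evaluated by hand for a candidate w. The sketch below assumes hinge-type costs, cost1 = max(0, 1 - z) and cost0 = max(0, 1 + z) with z = w'x + b. These definitions, the offset b, and the toy numbers are assumptions made for illustration, since the cost terms are not defined here.

```matlab
X = [0.1 0.2; 0.2 0.1; 0.8 0.9; 0.9 0.8];   % N x M feature matrix (made-up)
y = [0; 0; 1; 1];                           % labels in {0, 1}
w = [1; 1];                                 % candidate weights (M x 1)
b = -1;                                     % candidate offset (assumed term)
C = 10;

z = X * w + b;                              % linear score for each sample
cost1 = max(0, 1 - z);                      % assumed cost when y = 1
cost0 = max(0, 1 + z);                      % assumed cost when y = 0

% First term: weighted misfit. Second term: penalty on large weights.
J = C * sum(y .* cost1 + (1 - y) .* cost0) + sum(w.^2);
fprintf('objective = %g\n', J);
```

Under these assumed costs, a sample contributes to the first term whenever it is misclassified or lies inside the margin, and a larger C makes that contribution weigh more against the weight penalty.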
Use samples with true labels to assess the performance of the model.
$$\text{sensitivity} = \frac{TP}{TP + FN} \quad (5)$$
$$\text{specificity} = \frac{TN}{TN + FP} \quad (6)$$
$$\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (7)$$
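Equations (5) to (7) can be computed directly from the true and predicted labels. A minimal sketch with made-up labels, where 1 is the positive class:

```matlab
trueLabel      = [1 1 1 1 0 0 0 0 0 0]';
predictedLabel = [1 1 1 0 0 0 0 0 1 1]';

% Count the four outcomes of the confusion matrix.
TP = sum(predictedLabel == 1 & trueLabel == 1);
FN = sum(predictedLabel == 0 & trueLabel == 1);
TN = sum(predictedLabel == 0 & trueLabel == 0);
FP = sum(predictedLabel == 1 & trueLabel == 0);

sensitivity = TP / (TP + FN);
specificity = TN / (TN + FP);
accuracy    = (TP + TN) / (TP + TN + FP + FN);
```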
2 AUC evaluates how well the model ranks the samples. For example, if the
predicted scores of the negative samples are always smaller than the
predicted scores of the positive samples, the model reaches the maximum
AUC of 1.
[X,Y,THRE,AUC]=perfcurve(labels,scores,posclass);
plot(X,Y,'.')
xlabel('False Positive Rate (1 - specificity)');
ylabel('True Positive Rate (sensitivity)');
title(sprintf('AUC=%.3f',AUC));   % show the computed AUC in the title
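The ranking interpretation of AUC can be verified without perfcurve: AUC equals the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as one half. A minimal sketch with made-up labels and scores:

```matlab
labels = [1 1 1 0 0 0]';
scores = [0.9 0.8 0.4 0.5 0.3 0.1]';

pos = scores(labels == 1);
neg = scores(labels == 0);

% Compare every positive score with every negative score.
[P, Q] = ndgrid(pos, neg);
auc = (sum(P(:) > Q(:)) + 0.5 * sum(P(:) == Q(:))) / numel(P);
fprintf('AUC = %.4f\n', auc);
```

Here 8 of the 9 pairs are ranked correctly, because the positive sample with score 0.4 falls below the negative sample with score 0.5.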
Procedure:
1 Train models with different parameters ($c = 2^{-2}, 2^{-1}, 2^{0}, 2^{1}, 2^{2}$)
2 Test the models on the cross-validation set, and pick the model with
the highest AUC/accuracy
3 Test the picked model on the testing set, and report the performance
Try all the combinations defined on the grid. Choose the model that
gives the best performance on the cross-validation set.
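A grid search is a loop over all parameter combinations. The sketch below builds a grid over C and gamma and picks the best combination. The gamma range is an assumption, and cvScore is a hypothetical stand-in for "train on the training set, then score on the cross-validation set", so the winning values carry no meaning.

```matlab
cExp = -2:2;                       % C     = 2^-2 ... 2^2
gExp = -2:2;                       % gamma = 2^-2 ... 2^2 (assumed range)
[CE, GE] = ndgrid(cExp, gExp);
combos = [2.^CE(:), 2.^GE(:)];     % one row per (C, gamma) combination

% Hypothetical stand-in score; replace it with svmtrain on the training set
% and svmpredict on the cross-validation set. It peaks at C = 2, gamma = 0.5.
cvScore = @(c, g) -(log2(c) - 1)^2 - (log2(g) + 1)^2;

scores = zeros(size(combos, 1), 1);
for k = 1:size(combos, 1)
    scores(k) = cvScore(combos(k, 1), combos(k, 2));
end

[bestScore, best] = max(scores);
bestC = combos(best, 1);
bestGamma = combos(best, 2);

% Option string for the final model (C-SVC with an RBF kernel).
options = sprintf('-s 0 -t 2 -c %g -g %g', bestC, bestGamma);
disp(options);
```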
Model Selection & Evaluation
The model is assessed based on its performance on all the testing samples
across Fold 1, 2, and 3.
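The pooling across folds can be sketched as follows. Each sample is tested exactly once, and the performance is computed once over all pooled predictions. The fold assignment is deterministic for illustration, and the majority-class "model" is a stand-in for svmtrain and svmpredict.

```matlab
N = 9;                                  % number of samples (toy size)
K = 3;                                  % number of folds
label = [0 0 0 0 1 1 1 1 1]';           % made-up labels

% Deterministic fold assignment; shuffle with randperm(N) in practice.
foldIdx = mod((0:N-1)', K) + 1;

predicted = zeros(N, 1);
for k = 1:K
    testMask  = (foldIdx == k);
    trainMask = ~testMask;
    % Stand-in "model": predict the majority class of the training folds.
    % Replace with svmtrain on trainMask and svmpredict on testMask.
    majority = double(mean(label(trainMask)) >= 0.5);
    predicted(testMask) = majority;
end

% One performance value over the pooled predictions of all folds.
accuracy = mean(predicted == label);
```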
2 The data set contains 3 classes of 50 instances each, where each class
refers to a type of iris plant.
3 Attribute information:
1 sepal length in cm
2 sepal width in cm
3 petal length in cm
4 petal width in cm
clc
clear
load iris.mat;
% func_scale and func_LOOCV are helper functions that are not shown here
% Keep classes 2 and 3 only and relabel them as 1 and 0
features=features(label~=1,:);
label=label(label~=1);
label(label==2)=1;
label(label==3)=0;
sampleN=length(label);
score=zeros(sampleN,1);
cs=-10:10;   % candidate values of C are 2.^cs
% Outer loop: leave one sample out for testing
for i=1:sampleN
    train=features([1:i-1,i+1:end],:);
    test=features(i,:);
    [train,test]=func_scale(train,test);
    trainLabel=label([1:i-1,i+1:end]);
    testLabel=label(i);
    % Inner loop: select C by cross-validation on the training samples only
    AUCs=zeros(length(cs),1);
    for j=1:length(cs)
        AUCs(j)=func_LOOCV(train,trainLabel,2^(cs(j)));
    end
    [tmp,j]=max(AUCs);
    model = svmtrain(trainLabel, train, ['-s 0 -t 0 -c ',num2str(2^(cs(j)))]);
    [predicted_label, accuracy, score(i)] = svmpredict(testLabel, test, model);
end
% ROC curve and AUC over the pooled left-out scores
[X,Y,THRE,AUC]=perfcurve(label,score,1);
plot(X,Y,'.');
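The listing above relies on a helper, func_scale, whose body is not shown. The sketch below shows one plausible way to do such scaling, assuming min-max scaling with parameters estimated on the training samples only and then applied to the test samples. It is an illustration, not the helper itself.

```matlab
train = [1 10; 3 30; 5 20];      % made-up training features
test  = [2 40];                  % made-up test sample

% Estimate the scaling parameters on the training set only.
lo = min(train, [], 1);
hi = max(train, [], 1);
range = hi - lo;
range(range == 0) = 1;           % avoid division by zero for constant features

% Apply the same parameters to both sets.
trainScaled = (train - repmat(lo, size(train, 1), 1)) ./ repmat(range, size(train, 1), 1);
testScaled  = (test  - repmat(lo, size(test, 1), 1))  ./ repmat(range, size(test, 1), 1);
```

The scaled test sample may fall outside [0, 1], as the second feature does here. That is expected, because the test data must not influence the scaling parameters.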
clc
clear
load iris.mat;
% Assign the 50 samples of each of the 3 classes to 10 folds
cvidx = [crossvalind('Kfold', 50, 10); ...
    crossvalind('Kfold', 50, 10); ...
    crossvalind('Kfold', 50, 10)];
trueLabel=[];
predictedLabel=[];
for i=1:10
    train=features(cvidx~=i,:);
    test=features(cvidx==i,:);
    trainLabel=label(cvidx~=i);
    testLabel=label(cvidx==i);
    trueLabel=[trueLabel;testLabel];
    [train,test]=func_scale(train,test);
    model = svmtrain(trainLabel, train, '-s 0 -t 0 -c 1');
    [predicted_label, accuracy, decision_score] = svmpredict(testLabel, test, model);
    predictedLabel=[predictedLabel;predicted_label];
end
% Accuracy over the pooled predictions of all folds
accuracy=sum(predictedLabel==trueLabel)/length(trueLabel);
In linear SVM, we can use the weights to measure the importance of the
features.
Let's start from the binary case and assume you have two labels, 0 and 1.
After obtaining the model from calling svmtrain, do the following to obtain
w and b:
The larger the absolute value of w(i), the more important the i-th feature.
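A minimal sketch of extracting w and b and ranking the features by |w|. It uses a hand-made stand-in for the struct returned by svmtrain so that it runs without libsvm. The field names match those used in the listings in this deck, plus model.Label, and all numeric values are made up.

```matlab
% Stand-in for the struct returned by svmtrain (values are made up).
model.SVs     = [0.2 0.1; 0.8 0.9];     % support vectors, one per row
model.sv_coef = [2; -2];                % signed coefficients
model.rho     = -1.4;
model.Label   = [0; 1];                 % order in which libsvm stored the classes

w = full(model.SVs)' * model.sv_coef;   % SVs is sparse in a real model
b = -model.rho;                         % decision value is w'*x - rho

% libsvm's decision value is positive for model.Label(1). Flip the signs so
% that a positive w'*x + b means class 1.
if model.Label(1) ~= 1
    w = -w;
    b = -b;
end

[~, order] = sort(abs(w), 'descend');   % features ranked by importance
```

With these numbers the second feature has the larger absolute weight, so it is ranked first.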
Model: $y = w^T x + b$
clc
clear
% Simulate a noisy linear relationship
N=1000;
x=randn(N,1);
y=2*x+randn(N,1)+3;
% Min-max scale x to [0, 1]
Mi=max(x);
mi=min(x);
x=(x-mi)./(Mi-mi);
plot(x,y,'.')
% Train an epsilon-SVR (-s 3) with a linear kernel (-t 0)
model=svmtrain(y,x,'-s 3 -t 0 -c 4');
[predicted_label, accuracy, decision_score] = svmpredict(y, x, model);
w = model.SVs' * model.sv_coef;
b=-model.rho;   % libsvm computes w'*x - rho
hold on
plot([min(x),max(x)],[min(x)*w+b,max(x)*w+b],'r','LineWidth',2);
xlabel('x');
ylabel('y');
title('SVR linear model');
clc
clear
% Simulate a noisy quadratic relationship
N=1000;
x=randn(N,1);
y=(x-0.5).^2+randn(N,1);
% Min-max scale x to [0, 1]
Mi=max(x);
mi=min(x);
x=(x-mi)./(Mi-mi);
plot(x,y,'.')
% Train an epsilon-SVR (-s 3) with an RBF kernel (-t 2)
model=svmtrain(y,x,'-s 3 -t 2 -c 8');
[predicted_label, accuracy, decision_score] = svmpredict(y, x, model);
hold on
plot(x,predicted_label,'r*')
xlabel('x');
ylabel('y');
title('SVR nonlinear model');
clc
clear
load moore.mat;
data=moore;
label=data(:,end);   % last column is the response
data=data(:,1:end-1);
% Min-max scale each feature to [0, 1]
Mi=max(data);
mi=min(data);
m=size(data,1);
data=(data-repmat(mi,m,1))./(repmat(Mi-mi,m,1));

% Grid over C (-c), epsilon (-p), and gamma (-g), each taken from 2.^cs
cs=-15:15;
rs=zeros(length(cs)^3,1);
MSEs=zeros(length(cs)^3,1);
idx=0;
ijk=zeros(length(cs)^3,3);
for i=1:length(cs)
    for j=1:length(cs)
        for k=1:length(cs)
            idx=idx+1;
            model = svmtrain(label,data, ['-s 3 -t 2 -c ',num2str(2^(cs(i))),' -p ', ...
                num2str(2^(cs(j))),' -g ',num2str(2^(cs(k)))]);
            [predicted_label, accuracy, decision_score] = svmpredict(label, data, model);
            MSEs(idx)=accuracy(2);   % mean squared error
            rs(idx)=accuracy(3);     % squared correlation coefficient
            ijk(idx,:)=[i,j,k];
        end
    end
end
How to extract features from fMRI data? This is the most important
question for multivariate analysis.
1 Use the fMRI time series directly. The number of features would be:
the number of voxels * the number of time points
3 Define ROIs; each ROI is measured as the mean contrast value of the
voxels within it.
4 Use ICA time series. The number of features would be:
the number of ICs * the number of time points
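Two of the options above can be sketched as array operations. The example uses random numbers in place of real fMRI data, toy dimensions, and a hypothetical voxel-to-ROI assignment. All variable names are illustrative.

```matlab
nVoxels = 5; nTime = 4; nSubjects = 3;           % toy sizes

% Option 1: use the time series directly, one row of features per subject.
ts = rand(nVoxels, nTime, nSubjects);            % voxel x time x subject (made-up data)
X1 = reshape(ts, nVoxels * nTime, nSubjects)';   % nSubjects x (nVoxels * nTime)

% ROI option: one feature per ROI, the mean contrast value of its voxels.
contrast   = rand(nVoxels, nSubjects);           % voxel x subject (made-up data)
roiOfVoxel = [1 1 2 2 2]';                       % hypothetical ROI assignment
nROI = max(roiOfVoxel);
X3 = zeros(nSubjects, nROI);
for r = 1:nROI
    X3(:, r) = mean(contrast(roiOfVoxel == r, :), 1)';
end

fprintf('time series: %d features, ROIs: %d features\n', size(X1, 2), size(X3, 2));
```

The feature count drops from voxels times time points to the number of ROIs, which matters when the number of subjects is small.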
1 Classification problem: normal hearing (NH) vs. hearing impaired (HI)