Example of 10-fold SVM classification in MATLAB

Hossein · Jun 18, 2010 · Viewed 52k times

I need a reasonably descriptive example showing how to do 10-fold SVM classification on a two-class data set. There is just one example in the MATLAB documentation, but it does not use 10-fold cross-validation. Can someone help me?

Answer

Amro · Jun 18, 2010

Here's a complete example, using the following functions from the Bioinformatics Toolbox: SVMTRAIN, SVMCLASSIFY, CLASSPERF, CROSSVALIND.

load fisheriris                              %# load iris dataset
groups = ismember(species,'setosa');         %# create a two-class problem

%# number of cross-validation folds:
%# If you have 50 samples, divide them into 10 groups of 5 samples each,
%# then train with 9 groups (45 samples) and test with 1 group (5 samples).
%# This is repeated ten times, with each group used exactly once as a test set.
%# Finally the 10 results from the folds are averaged to produce a single
%# performance estimate.
k=10;

cvFolds = crossvalind('Kfold', groups, k);   %# get indices of 10-fold CV
cp = classperf(groups);                      %# init performance tracker

for i = 1:k                                  %# for each fold
    testIdx = (cvFolds == i);                %# get indices of test instances
    trainIdx = ~testIdx;                     %# get indices of training instances

    %# train an SVM model over training instances
    svmModel = svmtrain(meas(trainIdx,:), groups(trainIdx), ...
                 'Autoscale',true, 'Showplot',false, 'Method','QP', ...
                 'BoxConstraint',2e-1, 'Kernel_Function','rbf', 'RBF_Sigma',1);

    %# test using test instances
    pred = svmclassify(svmModel, meas(testIdx,:), 'Showplot',false);

    %# evaluate and update performance object
    cp = classperf(cp, pred, testIdx);
end

%# get accuracy
cp.CorrectRate

%# get confusion matrix
%# columns:actual, rows:predicted, last-row: unclassified instances
cp.CountingMatrix

with the output:

ans =
      0.99333
ans =
   100     1
     0    49
     0     0

We obtained 99.33% accuracy (149 of 150 instances correct), with only one 'setosa' instance misclassified as 'non-setosa'.
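
As a quick sanity check, the accuracy can be recomputed by hand from the confusion matrix printed above. This is only a sketch: the matrix is typed in from that output, and the variable names (`C`, `acc`, `recall`) are mine, not something CLASSPERF returns under those names. It assumes the first column is the 'non-setosa' class and the second is 'setosa', which matches the column totals of 100 and 50.

```matlab
%# confusion matrix copied from the output above
%# (columns: actual, rows: predicted, last row: unclassified)
C = [100  1;
       0 49;
       0  0];

nClasses = size(C, 2);
correct  = sum(diag(C(1:nClasses, :)));   %# 100 + 49 = 149 correct predictions
total    = sum(C(:));                      %# 150 instances in all
acc      = correct / total                 %# 149/150, about 0.99333

%# per-class recall: fraction of each actual class predicted correctly
recall = diag(C(1:nClasses, :))' ./ sum(C, 1)   %# [100/100, 49/50]
```

The second entry of `recall` (49/50) is the single 'setosa' instance that was predicted as 'non-setosa'.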


UPDATE: The SVM functions moved to the Statistics Toolbox in R2013a.
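
In later releases SVMTRAIN and SVMCLASSIFY were superseded by FITCSVM, so here is a rough sketch of the same experiment in that style. It assumes a release where the Statistics and Machine Learning Toolbox provides `fitcsvm`, `kfoldLoss`, `kfoldPredict` and `confusionmat`. The parameter mapping ('Autoscale' to 'Standardize', 'RBF_Sigma' to 'KernelScale') is my own approximation; the two RBF kernels are not scaled identically, and the folds are drawn differently, so the numbers need not match the output above.

```matlab
load fisheriris                              %# load iris dataset
groups = ismember(species,'setosa');         %# create a two-class problem

rng(1);                                      %# fix the seed so the folds are reproducible

%# train a 10-fold cross-validated SVM in one call
cvModel = fitcsvm(meas, groups, ...
              'Standardize',true, 'KernelFunction','rbf', ...
              'KernelScale',1, 'BoxConstraint',2e-1, ...
              'KFold',10);

%# accuracy = 1 - misclassification rate averaged over the folds
acc = 1 - kfoldLoss(cvModel)

%# out-of-fold prediction for every instance, then the confusion matrix
pred = kfoldPredict(cvModel);
C = confusionmat(groups, pred)               %# rows: actual, columns: predicted
```

Note that `confusionmat` puts the actual classes in rows, the opposite of the `cp.CountingMatrix` layout used earlier.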