Functionality of probability=TRUE in svm function of e1071 package in R

r svm
A.M. · Jun 13, 2014

In R, what does setting probability=TRUE do in the svm function of the e1071 package?

model <- svm(Type ~ ., data = data, probability = TRUE, cost = 100, gamma = 1)

Answer

jbaums · Jun 13, 2014

Setting probability=TRUE in both the call to svm and the call to predict makes predict return, alongside the predicted classes, each observation's estimated probability of belonging to each class of the response variable. These probabilities are stored as a matrix in the "probabilities" attribute of the prediction object.

For example:

library(e1071)

model <- svm(Species ~ ., data = iris, probability=TRUE)
# (below I'm just predicting to the training dataset - it could of course just 
# as easily be a separate test dataset)
pred <- predict(model, iris, probability=TRUE)

head(attr(pred, "probabilities"))

#      setosa versicolor   virginica
# 1 0.9803339 0.01129740 0.008368729
# 2 0.9729193 0.01807053 0.009010195
# 3 0.9790435 0.01192820 0.009028276
# 4 0.9750030 0.01531171 0.009685342
# 5 0.9795183 0.01164689 0.008834838
# 6 0.9740730 0.01679643 0.009130620
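
As a quick sanity check on that matrix, the sketch below confirms that each row sums to about 1 and picks out the most probable class per observation. It is only an illustration: the names probs, row_sums, all_sum_to_one, top_class and agreement are made up here, and the class with the highest probability may occasionally differ from the label that predict returns, so the last line measures agreement rather than assuming it.

```r
library(e1071)

model <- svm(Species ~ ., data = iris, probability = TRUE)
pred <- predict(model, iris, probability = TRUE)

# Matrix of class probabilities: one row per observation, one column per class
probs <- attr(pred, "probabilities")

# Each row should sum to (approximately) 1
row_sums <- rowSums(probs)
all_sum_to_one <- isTRUE(all.equal(unname(row_sums), rep(1, nrow(probs))))

# Class with the highest estimated probability for each observation.
# Column names are used rather than factor levels, so the column order
# of the matrix does not matter.
top_class <- colnames(probs)[max.col(probs, ties.method = "first")]

# Share of observations where the most probable class matches the
# label returned by predict()
agreement <- mean(top_class == as.character(pred))
```

Looking classes up through colnames(probs) avoids relying on the columns appearing in the same order as levels(iris$Species).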

Note, however, that probability=TRUE must be set in the call to svm, and not just in the call to predict. Setting it only in predict would produce uninformative, uniform probabilities:

#      setosa versicolor virginica
# 1 0.3333333  0.3333333 0.3333333
# 2 0.3333333  0.3333333 0.3333333
# 3 0.3333333  0.3333333 0.3333333
# 4 0.3333333  0.3333333 0.3333333
# 5 0.3333333  0.3333333 0.3333333
# 6 0.3333333  0.3333333 0.3333333
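
One way to guard against this mistake is to check, before predicting, whether the model was fitted with probabilities. The sketch below assumes the fitted svm object exposes a logical compprob component recording this; predict_with_probs is a hypothetical helper written for this example, not an e1071 function.

```r
library(e1071)

# Hypothetical helper (not part of e1071): return the class-probability
# matrix, or fail early if the model was fitted without probabilities.
predict_with_probs <- function(model, newdata) {
  # Assumption: model$compprob is TRUE only when svm() was called
  # with probability = TRUE
  if (!isTRUE(model$compprob)) {
    stop("Model was fitted without probability = TRUE; refit to get class probabilities.")
  }
  pred <- predict(model, newdata, probability = TRUE)
  attr(pred, "probabilities")
}

fit_with <- svm(Species ~ ., data = iris, probability = TRUE)
fit_without <- svm(Species ~ ., data = iris)

# Works: the model carries the information needed for probabilities
probs <- predict_with_probs(fit_with, iris)

# Fails fast with a clear message instead of returning uninformative values
result <- tryCatch(
  predict_with_probs(fit_without, iris),
  error = function(e) conditionMessage(e)
)
```

Stopping with an error is a design choice for this sketch; a warning plus a fallback to plain class predictions would work equally well.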