Predict probabilities using SVM

Vidya Marathe · Mar 27, 2018 · Viewed 8.9k times

I wrote the following code to obtain classification probabilities from an SVM:

from sklearn import svm

# Seven training samples, one per class
X = [[0, 0], [10, 10], [20, 30], [30, 30], [40, 30], [80, 60], [80, 50]]
y = [0, 1, 2, 3, 4, 5, 6]
clf = svm.SVC(probability=True)  # enable probability estimates before fitting
clf.fit(X, y)
prob = clf.predict_proba([[10, 10]])
print(prob)

I obtained this output:

[[0.15376986 0.07691205 0.15388546 0.15389275 0.15386348 0.15383004 0.15384636]]

which looks wrong to me, because I expected the probabilities to be

[0 1 0 0 0 0 0]

(Note that the sample being classified is identical to the 2nd training sample.) Moreover, the probability obtained for that class is the lowest of all.

Answer

Tim · Mar 27, 2018

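You should disable probability and use decision_function instead, because there is no guarantee that predict_proba and predict return the same result. The probabilities come from Platt scaling fitted with an internal cross-validation, which is unreliable on a dataset this small (one sample per class), whereas predict relies on the decision values directly. You can read more about it in the scikit-learn documentation for SVC.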
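To see the two methods side by side, here is a minimal sketch that reuses the data from the question. The `random_state=0` argument and the variable names are assumptions added for repeatability and readability; they are not part of the original code. The exact probabilities may differ between scikit-learn versions, so no particular values are claimed here.

```python
import numpy as np
from sklearn import svm

# Data copied from the question: seven samples, one per class.
X = [[0, 0], [10, 10], [20, 30], [30, 30], [40, 30], [80, 60], [80, 50]]
y = [0, 1, 2, 3, 4, 5, 6]

# random_state is an assumption added here so the internal
# cross-validation used for the probability model is repeatable.
clf = svm.SVC(probability=True, random_state=0)
clf.fit(X, y)

sample = [[10, 10]]

# predict uses the decision values of the fitted SVM.
label_from_predict = clf.predict(sample)[0]

# predict_proba uses the separately calibrated probability model.
proba = clf.predict_proba(sample)
label_from_proba = clf.classes_[np.argmax(proba, axis=1)][0]

print("predict:", label_from_predict)
print("argmax of predict_proba:", label_from_proba)
print("probabilities sum to:", proba.sum())
```

If the two printed labels differ, that is the inconsistency described above rather than a bug in the calling code.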

import numpy as np

clf.predict([[10, 10]])  # returns [1], as expected

scores = clf.decision_function([[10, 10]])
# returns [[ 4.91666667  6.5         3.91666667  2.91666667  1.91666667  0.91666667 -0.08333333]]
prediction = np.argmax(scores)  # returns 1 (index of the highest score, which here equals the class label)
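One caveat: np.argmax returns a column index, which matches the class label above only because the labels happen to be 0 through 6. A hedged sketch of the more general pattern, mapping the index through clf.classes_, is shown below. The labels 10 to 70 are hypothetical and chosen only so that index and label differ; `decision_function_shape="ovr"` is spelled out explicitly as an assumption about the intended score layout (one column per class).

```python
import numpy as np
from sklearn import svm

X = [[0, 0], [10, 10], [20, 30], [30, 30], [40, 30], [80, 60], [80, 50]]
# Hypothetical labels (not from the question), chosen so that index != label.
y = [10, 20, 30, 40, 50, 60, 70]

# "ovr" yields one score column per class, ordered like clf.classes_.
clf = svm.SVC(decision_function_shape="ovr")
clf.fit(X, y)

scores = clf.decision_function([[10, 10]])  # shape (1, 7)
best_index = np.argmax(scores, axis=1)      # column index of the top score
best_label = clf.classes_[best_index]       # map the index back to a label

print("index:", best_index[0], "label:", best_label[0])
```

Passing `axis=1` keeps the lookup correct when several samples are scored at once.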