sklearn.ensemble.AdaBoostClassifier cannot accept SVM as base_estimator?

allenwang · Nov 24, 2014 · Viewed 15.6k times

I am doing a text classification task. Now I want to use ensemble.AdaBoostClassifier with LinearSVC as base_estimator. However, when I try to run the code

clf = AdaBoostClassifier(svm.LinearSVC(), n_estimators=50, learning_rate=1.0, algorithm='SAMME.R')
clf.fit(X, y)

An error occurred: TypeError: AdaBoostClassifier with algorithm='SAMME.R' requires that the weak learner supports the calculation of class probabilities with a predict_proba method

The first question: can svm.LinearSVC() not calculate class probabilities? How can I make it calculate them?

Then I changed the algorithm parameter and ran the code again.

clf = AdaBoostClassifier(svm.LinearSVC(), n_estimators=50, learning_rate=1.0, algorithm='SAMME')
clf.fit(X, y)

This time TypeError: fit() got an unexpected keyword argument 'sample_weight' occurs. As the AdaBoostClassifier documentation says: "Sample weights. If None, the sample weights are initialized to 1 / n_samples." Even when I assign an integer to n_samples, the error still occurs.

The second question: what does n_samples mean, and how can I solve this problem?

Hope anyone could help me.

Following @jme's comment, I then tried

clf = AdaBoostClassifier(svm.SVC(kernel='linear', probability=True), n_estimators=10, learning_rate=1.0, algorithm='SAMME.R')
clf.fit(X, y)

The program never produces a result, and the memory used on the server stays unchanged.

The third question: how can I make AdaBoostClassifier work with SVC as base_estimator?

Answer

kevin · Jan 11, 2016

The right answer will depend on exactly what you're looking for. LinearSVC cannot predict class probabilities (required by the default algorithm used by AdaBoostClassifier) and does not support sample_weight.
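You can verify the missing method directly on an unfitted estimator: LinearSVC exposes decision_function (raw margin scores) but has no predict_proba at all, which is exactly what the first TypeError complains about.

```python
from sklearn.svm import LinearSVC

est = LinearSVC()
print(hasattr(est, "decision_function"))  # True
print(hasattr(est, "predict_proba"))      # False
```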

You should be aware that the Support Vector Machine does not nominally predict class probabilities. They are computed using Platt scaling (or an extension of Platt scaling in the multi-class case), a technique which has known issues. If you need less "artificial" class probabilities, an SVM might not be the way to go.

With that said, I believe the answer that best fits your question is the one given by Graham. That is,

from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

clf = AdaBoostClassifier(SVC(probability=True, kernel='linear'), ...)
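Filled out into a runnable sketch on a small synthetic dataset (make_classification here is purely illustrative; the parameter values are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC

# Toy data standing in for your text features
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# probability=True turns on Platt scaling inside SVC, which is what
# provides the predict_proba method AdaBoost's default algorithm needs.
clf = AdaBoostClassifier(SVC(probability=True, kernel='linear'),
                         n_estimators=10, learning_rate=1.0)
clf.fit(X, y)
print(clf.score(X, y))
```

Note that the internal cross-validation behind probability=True is expensive, which is one reason the original poster's run on a large text corpus appeared to hang.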

You have other options. You can use SGDClassifier with a hinge loss function and set AdaBoostClassifier to use the SAMME algorithm (which does not require a predict_proba function, but does require support for sample_weight):

from sklearn.linear_model import SGDClassifier

clf = AdaBoostClassifier(SGDClassifier(loss='hinge'), algorithm='SAMME', ...)
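A runnable version of the SGD variant, again on hypothetical synthetic data (the random_state values are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# SAMME only needs discrete class predictions plus sample_weight support,
# both of which SGDClassifier with hinge loss provides.
clf = AdaBoostClassifier(SGDClassifier(loss='hinge', random_state=0),
                         n_estimators=10, algorithm='SAMME')
clf.fit(X, y)
print(clf.score(X, y))
```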

Perhaps the best answer would be to use a classifier that has native support for class probabilities, like Logistic Regression, if you want to use the default algorithm provided for AdaBoostClassifier. You can do this using sklearn.linear_model.LogisticRegression or using SGDClassifier with a log loss function, as used in the code provided by Kris.
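A sketch of the logistic-regression option: LogisticRegression natively implements both predict_proba and sample_weight, so it satisfies either AdaBoost algorithm with no workarounds (data again synthetic and illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

clf = AdaBoostClassifier(LogisticRegression(), n_estimators=10)
clf.fit(X, y)

# Ensemble-level class probabilities, one row per sample
print(clf.predict_proba(X[:3]))
```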

Hope that helps. If you're curious about what Platt scaling is, check out the original paper by John Platt.