In sklearn what is the difference between a SVM model with linear kernel and a SGD classifier with loss=hinge

JackNova · Apr 17, 2015 · Viewed 7.1k times

I see that in scikit-learn I can build an SVM classifier with a linear kernel in at least 3 different ways:
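(The original code listing did not survive; presumably the three constructions, in the order the answer discusses them, were along these lines. The exact constructor arguments here are assumptions.)

```python
from sklearn.svm import LinearSVC, SVC
from sklearn.linear_model import SGDClassifier

# Three ways to get a linear SVM in scikit-learn (arguments are illustrative):
clf1 = LinearSVC()                  # implemented in terms of liblinear
clf2 = SVC(kernel='linear')         # implemented in terms of libsvm
clf3 = SGDClassifier(loss='hinge')  # stochastic gradient descent on the hinge loss
```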

Now, I see that the difference between the first two classifiers is that the former is implemented in terms of liblinear and the latter in terms of libsvm.

How do the first two classifiers differ from the third one?

Answer

eickenberg · Apr 17, 2015

The first two always use the full training set and solve a convex optimization problem over those data points.

The latter can process the data in batches and performs gradient descent, aiming to minimize the expected loss with respect to the sample distribution, assuming that the examples are i.i.d. samples from that distribution.
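To make that concrete, here is a plain-NumPy sketch of a single SGD update on the regularized hinge loss, for one sample `(x, y)` with `y ∈ {-1, +1}`. This is an illustration of the idea, not scikit-learn's actual implementation (which also uses learning-rate schedules, averaging, etc.):

```python
import numpy as np

def sgd_hinge_step(w, b, x, y, lr=0.01, alpha=0.0001):
    """One stochastic (sub)gradient step on the regularized hinge loss
    L(w, b) = alpha/2 * ||w||^2 + max(0, 1 - y * (w.x + b))."""
    margin = y * (np.dot(w, x) + b)
    grad_w = alpha * w            # gradient of the L2 penalty term
    grad_b = 0.0
    if margin < 1:                # sample is misclassified or inside the margin
        grad_w = grad_w - y * x   # subgradient of the hinge term
        grad_b = -y
    return w - lr * grad_w, b - lr * grad_b
```

Each step only touches one sample, which is why the expected loss over the sample distribution, rather than the exact loss over a fixed dataset, is the quantity being minimized.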

The latter is typically used when the number of samples is very large, or when the data arrive as an unbounded stream. Note that you can call the partial_fit method and feed it chunks of data.
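A minimal sketch of what that looks like (the streamed chunks here are synthetic, made up for illustration; note that partial_fit needs the full set of class labels up front via the classes argument):

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier(loss='hinge')
classes = np.array([0, 1])          # all labels must be declared on the first call

rng = np.random.RandomState(0)
for _ in range(20):                 # pretend 20 chunks arrive from a stream
    X_chunk = rng.randn(50, 2)
    y_chunk = (X_chunk[:, 0] > 0).astype(int)   # toy labeling rule
    clf.partial_fit(X_chunk, y_chunk, classes=classes)
```

Each call updates the model in place with one pass over the chunk, so memory usage stays bounded no matter how much data you stream through.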

Hope this helps!