Differences in SciKit Learn, Keras, or Pytorch

MaxPi picture MaxPi · Feb 5, 2019 · Viewed 17k times · Source

Are these libraries fairly interchangeable?

Looking here, https://stackshare.io/stackups/keras-vs-pytorch-vs-scikit-learn, it seems the major difference is the underlying framework (at least for PyTorch).

Answer

Jatentaki picture Jatentaki · Feb 5, 2019

Yes, there is a major difference.

SciKit Learn is a general machine learning library, built on top of NumPy. It features a lot of machine learning algorithms such as support vector machines, random forests, as well as a lot of utilities for general pre- and postprocessing of data. It is not a neural network framework.

PyTorch is a deep learning framework, consisting of

  1. A vectorized math library similar to NumPy, but with GPU support and a lot of neural network related operations (such as softmax or various kinds of activations)
  2. Autograd - an algorithm which can automatically calculate gradients of your functions, defined in terms of the basic operations
  3. Gradient-based optimization routines for large scale optimization, dedicated to neural network optimization
  4. Neural-network related utility functions

Keras is a higher-level deep learning framework, which abstracts many details away, making code simpler and more concise than in PyTorch or TensorFlow, at the cost of limited hackability. It abstracts away the computation backend, which can be TensorFlow, Theano or CNTK. It does not support a PyTorch backend, but that's not something unfathomable - you can consider it a simplified and streamlined subset of the above.

In short, if you are going with "classic", non-neural algorithms, neither PyTorch nor Keras will be useful for you. If you're doing deep learning, scikit-learn may still be useful for its utility part; aside from it you will need the actual deep learning framework, where you can choose between Keras and PyTorch but you're unlikely to use both at the same time. This is very subjective, but in my view, if you're working on a novel algorithm, you're more likely to go with PyTorch (or TensorFlow or some other lower-level framework) for flexibility. If you're adapting a known and tested algorithm to a new problem setting, you may want to go with Keras for its greater simplicity and lower entry level.