Should conda or conda-forge be used for Python environments?

tilikoom · Oct 4, 2016 · Viewed 65.5k times

Conda and conda-forge are both Python package managers. What is the appropriate choice when a package exists in both repositories? Django, for example, can be installed from either, but the two builds differ by several dependencies (conda-forge has many more). There is no explanation for these differences, not even a simple README.

Which one should be used? Conda or conda-forge? Does it matter?

Answer

darthbith · Oct 5, 2016

The short answer is that, in my experience, it generally doesn't matter which you use.

The long answer:

conda-forge is an additional channel from which packages may be installed. In this sense, it is no more special than the defaults channel, or any of the other hundreds (thousands?) of channels that people have posted packages to. You can add your own channel if you sign up at https://anaconda.org and upload your own conda packages.

Here we need to distinguish between conda, the cross-platform package manager, and conda-forge, the package channel; the phrasing of your question suggests the two are being treated as the same kind of thing. Anaconda Inc. (formerly Continuum Analytics), the main developers of the conda software, also maintain a separate channel of packages, called defaults, which is what gets used when you type conda install packagename without changing any options.

There are three ways to change the options for channels. The first two are done every time you install a package and the last one is persistent. The first one is to specify a channel every time you install a package:

conda install -c some-channel packagename

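Of course, the package has to exist on that channel. For this one command, some-channel is searched first, so packagename and any of its dependencies that some-channel provides will generally come from there; anything it lacks is still looked up in your other configured channels. Alternatively, you can specify: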

conda install some-channel::packagename

The package still has to exist on some-channel, but now only packagename is pulled from some-channel. Any other packages needed to satisfy dependencies are looked up in your configured list of channels.
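One way to see the difference between the two forms is to preview them with --dry-run, which reports what conda would do without changing the environment. The snippet below is only a sketch under a few assumptions: django is a stand-in package name, and RUN is a made-up opt-in switch for this example, so by default the script only prints the two commands.

```shell
# Stand-in package for this example; any package on conda-forge would do.
PKG="django"

# Form 1: conda-forge is added to the channel search for this command.
CMD_CHANNEL_FLAG="conda install --dry-run --channel conda-forge $PKG"

# Form 2: only $PKG itself is requested from conda-forge.
CMD_CHANNEL_PREFIX="conda install --dry-run conda-forge::$PKG"

for cmd in "$CMD_CHANNEL_FLAG" "$CMD_CHANNEL_PREFIX"; do
  echo "+ $cmd"
  # RUN is a made-up opt-in switch: set RUN=1 to actually call conda.
  if [ "${RUN:-0}" = "1" ] && command -v conda >/dev/null 2>&1; then
    $cmd || true   # keep going even if the preview fails (e.g. no network)
  fi
done
```

The package plan printed by each dry run should show where every package would come from. After a real install, conda list --show-channel-urls shows which channel each installed package came from.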

To see your channel configuration, you can write:

conda config --show channels

The third, persistent option is to control the order in which channels are searched with conda config. You can write:

conda config --add channels some-channel

to add the channel some-channel to the top of the channels configuration list. This gives some-channel the highest priority. Priority determines (in part) which channel is selected when more than one channel has a particular package. To add the channel to the end of the list and give it the lowest priority, type:

conda config --append channels some-channel

If you would like to remove the channel that you added, you can do so by writing:

conda config --remove channels some-channel

See

conda config -h

for more options.
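The conda config commands above all edit a YAML configuration file (by default .condarc in your home directory). As a rough sketch, this is what the channels section typically looks like after running conda config --add channels conda-forge on a fresh setup; entries nearer the top have higher priority. DEMO_RC is a scratch path made up for this example, so your real configuration is not touched.

```shell
# Write a demo config to a scratch file; the real ~/.condarc is untouched.
DEMO_RC="$(mktemp)"

# Channels are listed from highest priority (top) to lowest (bottom).
cat > "$DEMO_RC" <<'EOF'
channels:
  - conda-forge
  - defaults
EOF

# Show the resulting file.
cat "$DEMO_RC"
```

To check which configuration files conda is actually reading on your machine, conda config --show-sources lists them along with their contents.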

With all of that said, there are four main reasons to use the conda-forge channel instead of the defaults channel maintained by Anaconda:

  1. Packages on conda-forge may be more up-to-date than those on the defaults channel.
  2. There are packages on the conda-forge channel that aren't available from defaults.
  3. You would prefer to use a dependency such as openblas (from conda-forge) instead of mkl (from defaults).
  4. If you are installing a package that requires a compiled library (e.g., a C extension or a wrapper around a C library), it may reduce the chance of incompatibilities if you install all of the packages in an environment from a single channel due to binary compatibility of the base C library (but this advice may be out of date/change in the future).
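
For the last point, one possible way to keep an environment on a single channel is to create it with --override-channels, which makes conda ignore the channels from your configuration and search only the one given on the command line. This is a sketch, not the only approach: the environment name forge-only, the django package, and the RUN opt-in switch are all placeholders invented for the example.

```shell
# Placeholder environment name for this example.
ENV_NAME="forge-only"

# --override-channels makes conda ignore the configured channels (including
# defaults) and search only the channel passed with --channel.
CREATE_CMD="conda create --yes --name $ENV_NAME --channel conda-forge --override-channels django"

echo "+ $CREATE_CMD"

# RUN is a made-up opt-in switch: set RUN=1 to actually create the environment.
if [ "${RUN:-0}" = "1" ] && command -v conda >/dev/null 2>&1; then
  $CREATE_CMD
fi
```

Newer conda versions also have a channel_priority setting (conda config --set channel_priority strict) that makes conda stick to the highest-priority channel that has a given package, which may serve a similar purpose when several channels are configured.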