Python package dependency tree

MRocklin · Mar 29, 2013 · Viewed 9.8k times

I would like to analyze the dependency tree of Python packages. How can I obtain this data?

Things I already know

  1. setup.py sometimes contains a requires field that lists package dependencies
  2. PyPI is an online repository of Python packages
  3. PyPI has an API

Things that I don't know

  1. Very few projects (around 10%) on PyPI explicitly list dependencies in the requires field, but pip/easy_install still manage to download the correct packages. What am I missing? For example, pandas, the popular library for statistical computing, doesn't list requires but still manages to install numpy, pytz, etc. Is there a better way to automatically collect the full list of dependencies?
  2. Is there a pre-existing database somewhere? Am I repeating existing work?
  3. Do similar, easily accessible databases exist for other languages with distribution systems (R, Clojure, etc.)?

Answer

Martijn Pieters · Mar 29, 2013

You should be looking at the install_requires field instead; see "New and changed setup keywords".

requires is deemed too vague a field to rely on for dependency installation. In addition, there are setup_requires and tests_require fields, for dependencies needed to run setup.py itself and to run the tests, respectively.
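For packages that are already installed locally, the install_requires data ends up in the package metadata, so one way to collect a dependency tree is to read it back from there. The following is only a minimal sketch, assuming Python 3.8+ (for importlib.metadata); the helper names requirement_name, declared_dependencies, and dependency_tree are made up for illustration, and the "extra ==" check is a rough heuristic for skipping optional extras rather than a full marker parser.

```python
import re
from importlib import metadata


def requirement_name(requirement):
    """Extract the bare project name from a requirement string.

    For example, 'numpy>=1.6.1' gives 'numpy'.
    """
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return match.group(1).lower() if match else None


def declared_dependencies(dist_name):
    """Return the sorted names of the dependencies an installed package declares."""
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []  # not installed locally, so there is no metadata to read
    names = set()
    for requirement in requirements:
        if "extra ==" in requirement:
            continue  # heuristic: skip dependencies that belong to optional extras
        name = requirement_name(requirement)
        if name:
            names.add(name)
    return sorted(names)


def dependency_tree(dist_name, seen=None):
    """Recursively map each declared dependency to its own dependency tree.

    A package is only expanded the first time it is met, which also guards
    against circular dependencies.
    """
    seen = set() if seen is None else seen
    if dist_name in seen:
        return {}
    seen.add(dist_name)
    return {dep: dependency_tree(dep, seen) for dep in declared_dependencies(dist_name)}
```

Calling dependency_tree("pandas") in an environment where pandas is installed would return a nested dict of whatever that installed version declares. Packages that are not installed simply show up with an empty subtree.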

Certainly, the dependency graph has been analyzed before; from this blog article by Olivier Girardot comes this fantastic image:

[Image: PyPI dependencies]
The image is linked to the interactive version of the graph.