Build wheel for a package (like scipy) lacking dependency declaration

Midnighter · Jun 22, 2014 · Viewed 80.6k times

I think it doesn't make a difference here but I'm using Python 2.7.

So the general part of my question is the following: I use a separate virtualenv for each of my projects. I don't have administrator access and I don't want to mess with system-installed packages anyway. Naturally, I want to use wheels to speed up package upgrades and installations across the virtualenvs. How can I build a wheel whose dependencies are only met within a specific virtualenv?

Specifically, issuing

pip wheel -w $WHEELHOUSE scipy

fails with

Building wheels for collected packages: scipy
  Running setup.py bdist_wheel for scipy
  Destination directory: /home/moritz/.pip/wheelhouse
  Complete output from command /home/moritz/.virtualenvs/base/bin/python -c "import setuptools;__file__='/home/moritz/.virtualenvs/base/build/scipy/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d /home/moritz/.pip/wheelhouse:
  Traceback (most recent call last):

  File "<string>", line 1, in <module>

  File "/home/moritz/.virtualenvs/base/build/scipy/setup.py", line 237, in <module>

    setup_package()

  File "/home/moritz/.virtualenvs/base/build/scipy/setup.py", line 225, in setup_package

    from numpy.distutils.core import setup

ImportError: No module named numpy.distutils.core

----------------------------------------
  Failed building wheel for scipy
Failed to build scipy
Cleaning up...

The build fails because numpy is not globally present. Building the wheel does work when a virtualenv with numpy installed is active, but it seems like a terrible idea to have the wheel depend on a specific virtualenv's version of numpy.

pandas, which also depends on numpy, appears to install its own components of numpy, but I'm not sure that's the best solution.

I could install numpy with --user and use that to build the scipy wheel. Are there better options?

Answer

Jan Vlcinsky · Jun 22, 2014

Problem description

  • You have a Python package (like scipy) that depends on other packages (like numpy), but its setup.py does not declare that dependency.
  • Building a wheel for such a package succeeds if the current environment provides the required package(s).
  • If the required packages are not available, building the wheel fails.

Note: The ideal solution is to fix the broken setup.py by adding the missing dependency declaration. This is usually not feasible, so we have to work around it.
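For illustration, here is roughly what such a declaration could look like. This is only a sketch for a hypothetical package called mypkg, not scipy's actual setup.py; `setup_requires` and `install_requires` are standard setuptools keywords, everything else is made up. Note that `setup_requires` cannot help when setup.py imports numpy at the top of the file (as in the traceback from the question), because that import runs before `setup()` is ever called.

```python
def setup_kwargs():
    """Keyword arguments for setuptools.setup() of a hypothetical package."""
    return dict(
        name="mypkg",                  # hypothetical package name
        version="0.1",                 # hypothetical version
        packages=["mypkg"],
        setup_requires=["numpy"],      # needed while setup() itself runs
        install_requires=["numpy"],    # needed by the installed package
    )


def main():
    # Imported here so the declaration above can be inspected
    # even on a machine without setuptools.
    from setuptools import setup
    setup(**setup_kwargs())
```

A real setup.py would call `main()` at module level.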

Solution: Install required packages first

The procedure (for installing scipy, which requires numpy) has two steps:

  1. build the wheels
  2. use the wheels to install the package you need

Populate wheelhouse with wheels you need

This has to be done only once; the wheels can then be reused many times.

  1. configure pip so that installation from wheels is allowed and the wheelhouse directory is set up and is the same directory that find-links points to (download-cache is set as well), as in the following example of pip.conf:

    [global]
    download-cache = /home/javl/.pip/cache
    find-links = /home/javl/.pip/packages
    
    [install]
    use-wheel = yes
    
    [wheel]
    wheel-dir = /home/javl/.pip/packages
    
  2. install the system libraries required by all packages that have to be compiled

  3. build a wheel for the required package (numpy)

    $ pip wheel numpy
    
  4. set up a virtualenv (needed only once), activate it, and install numpy into it:

    $ pip install numpy
    

    Since the wheel is already built, this should be quick.

  5. build a wheel for scipy (while still in the virtualenv)

    $ pip wheel scipy
    

    By now, your wheelhouse is populated with the wheels you need.

  6. You can remove the temporary virtualenv; it is no longer needed.
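Steps 3 to 6 above could be scripted along these lines. This is a minimal, untested-against-real-pip sketch: the function names and the throwaway virtualenv path are made up, it assumes a POSIX layout (`bin/pip` inside the virtualenv), that `virtualenv` is on the PATH, and that pip.conf is already set up as in step 1. Calling the virtualenv's own pip stands in for activating it.

```python
import subprocess


def wheelhouse_build_plan(venv_dir, build_dep="numpy", target="scipy"):
    """Return the commands, in order, that populate the wheelhouse.

    venv_dir is a path to a temporary virtualenv (hypothetical, caller's choice).
    """
    venv_pip = venv_dir.rstrip("/") + "/bin/pip"  # POSIX layout assumed
    return [
        ["pip", "wheel", build_dep],        # step 3: wheel for the dependency
        ["virtualenv", venv_dir],           # step 4: temporary virtualenv ...
        [venv_pip, "install", build_dep],   # ... with the dependency installed
        [venv_pip, "wheel", target],        # step 5: wheel for the real target
        ["rm", "-rf", venv_dir],            # step 6: drop the virtualenv
    ]


def run_plan(plan):
    """Run each command in turn and stop at the first failure."""
    for cmd in plan:
        subprocess.check_call(cmd)
```

Printing the plan before passing it to `run_plan` is a cheap way to check that the paths are what you expect.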

Installing into fresh virtualenv

I am assuming you have created a fresh virtualenv, activated it, and want to install scipy there.

Installing only the new scipy wheel would still leave numpy missing, so we install numpy first:

$ pip install numpy

Then finish with scipy:

$ pip install scipy

I guess this could be done in one call (but I did not test it):

$ pip install numpy scipy

Repeatedly installing scipy of proven version

It is likely that at some point a new version of scipy or numpy will be released, and pip will then attempt to install that latest version, for which there is no wheel in your wheelhouse.

If you can live with the versions you have used so far, you should create a requirements.txt that pins the versions of numpy and scipy you want and install from it.

This ensures that the needed package is present before it is actually used.
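To see in advance whether the wheelhouse already covers the pinned versions, something like the following might help. It is a minimal sketch: the function names are made up, the version numbers in the usage note are only examples, and name matching is simplified to a lowercase comparison, which is enough for numpy and scipy.

```python
def parse_pins(requirements_text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if "==" not in line:
            continue                           # skip unpinned or empty lines
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def wheel_name_version(filename):
    """Extract (name, version) from a wheel filename.

    Wheel filenames follow name-version(-build)-pytag-abitag-platform.whl.
    """
    parts = filename[:-len(".whl")].split("-")
    return parts[0].lower(), parts[1]


def missing_wheels(pins, wheel_filenames):
    """Return the pinned package names that have no matching wheel."""
    available = set(
        wheel_name_version(f) for f in wheel_filenames if f.endswith(".whl")
    )
    return sorted(
        name for name, version in pins.items()
        if (name, version) not in available
    )
```

For example, with a requirements.txt containing `numpy==1.8.1` and `scipy==0.14.0`, you could pass the file contents to `parse_pins` and the result of `os.listdir()` on your wheelhouse directory to `missing_wheels`. If nothing is missing, `pip install -r requirements.txt` can be served entirely from the wheelhouse.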