I am dealing with a legacy Dockerfile. Here is a very simplified version of what I am dealing with:
FROM ubuntu:14.04
RUN apt-get -y update && apt-get -y install \
python-pip \
python-numpy # ...and many other packages
RUN pip install -U pip
RUN pip install -r /tmp/requirements1.txt # includes e.g., numpy==1.13.0
RUN pip install -r /tmp/requirements2.txt
RUN pip install -r /tmp/requirements3.txt
First, several packages are installed using apt, and then several packages are installed using pip. pip version 10 has been released, and part of the release is this new restriction:
Removed support for uninstalling projects which have been installed using distutils. distutils installed projects do not include metadata indicating what files belong to that install and thus it is impossible to actually uninstall them rather than just remove the metadata saying they've been installed while leaving all of the actual files behind.
This leads to the following problem in my setup. For example, apt first installs python-numpy. Later, pip tries to install a newer version of numpy from, e.g., /tmp/requirements1.txt and tries to uninstall the older version, but because of the new restriction it cannot remove that version:
Installing collected packages: numpy
Found existing installation: numpy 1.8.2
Cannot uninstall 'numpy'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
Now I know at this point there are several solutions.
I could skip installing python-numpy through apt. However, this causes issues because python-numpy pulls in a few other packages as dependencies, and I do not know whether another part of the system relies on them. In reality, the Dockerfile installs several apt packages, and each one I remove seems to reveal another Cannot uninstall X error and takes a number of other packages with it that our app may or may not rely on.
I could also use the --ignore-installed option when I pip install things that have already been installed through apt, but then I have the same problem: every --ignore-installed argument reveals yet another thing that needs to be ignored.
I could pin pip at an older version that does not have this restriction, but I don't want to be stuck using an outdated version of pip forever.
I have been going around in circles trying to come up with a good solution that involves minimal changes to this legacy Dockerfile and allows the app we deploy with that file to continue to function as it has been. Any suggestions as to how I can safely get around this problem of pip 10 not being able to install newer versions of distutils packages? Thank you!
I did not realize that --ignore-installed could be used without a package as an argument to ignore all installed packages. I am considering whether or not this might be a good option for me, and have asked about it here.
This is the solution I ended up going with, and our apps have been running in production without any issues for close to a month with this fix in place:
All I had to do was add --ignore-installed to the pip install lines in my Dockerfile that were raising errors. Using the same Dockerfile example from my original question, the fixed Dockerfile would look something like:
FROM ubuntu:14.04
RUN apt-get -y update && apt-get -y install \
python-pip \
python-numpy # ...and many other packages
RUN pip install -U pip
RUN pip install -r /tmp/requirements1.txt --ignore-installed # don't try to uninstall existing packages, e.g., numpy
RUN pip install -r /tmp/requirements2.txt
RUN pip install -r /tmp/requirements3.txt
The documentation I could find for --ignore-installed was unclear in my opinion (pip install --help simply says "Ignore the installed packages (reinstalling instead)."), and I asked about the potential dangers of this flag here, but have yet to get a satisfying answer. However, if there are any negative side effects, our production environment has yet to see them, and I think the risk is low to none (at least that has been our experience). I was able to confirm that in our case, when this flag was used, the existing installation was not uninstalled, but the newer installation was always the one used.
I wanted to highlight this answer by @ivan_pozdeev. He provides some information that this answer does not include, and he also outlines some potential side-effects of my solution.