How does Python keep track of modules installed with eggs?

Claudiu · Nov 29, 2010

If I have a module, foo, in Lib/site-packages, I can just import foo and it works. However, when I install something from an egg, I get a folder named something like blah-4.0.1-py2.7-win32.egg with the module contents inside, yet I still only need to write import foo, nothing more complicated. How does Python keep track of eggs? It is not just matching on the directory name: if I drop such a folder into a Python installation without going through distutils, Python does not find the module.

To be clearer: I just installed zope. The folder name is "zope.interface-3.3.0-py2.7-win32.egg". This works:

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import zope.interface
>>>

I create a "blah-4.0.1-py2.7-win32.egg" folder with an empty module "haha" in it (and __init__.py). This does not work:

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import blah.haha
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named blah.haha
>>>

This does, though:

Python 2.7.1 (r271:86832, Nov 27 2010, 18:30:46) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from pkg_resources import require
>>> require("blah>=1.0")
[blah 4.0.1 (c:\python27\lib\site-packages\blah-4.0.1-py2.7-win32.egg)]
>>> import haha
>>>

So how do I make it work without a require?

Answer

Ned Deily · Nov 29, 2010

If you use the easy_install script provided by setuptools (or the Distribute fork of it) to install packages as eggs, you will see that, by default, it creates a file named easy-install.pth in the site-packages directory of your Python installation. Path configuration files are a standard feature of Python:

A path configuration file is a file whose name has the form package.pth and exists in one of the four directories mentioned above; its contents are additional items (one per line) to be added to sys.path.
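To see the mechanism in isolation, here is a minimal sketch written for a current Python 3 interpreter. It does not touch the real site-packages: a temporary directory stands in for it, and the file name demo.pth is made up for the illustration. The egg folder and the empty haha module are the ones from the question. site.addsitedir() is the standard library call that processes .pth files in a directory.

```python
import importlib
import os
import site
import sys
import tempfile

# A throwaway directory standing in for site-packages (assumption for the demo).
fake_site = tempfile.mkdtemp()

# A hand-made "egg" directory holding one empty top-level module, haha.
egg_name = "blah-4.0.1-py2.7-win32.egg"
egg_dir = os.path.join(fake_site, egg_name)
os.mkdir(egg_dir)
with open(os.path.join(egg_dir, "haha.py"), "w") as f:
    f.write("")

# The path configuration file: one sys.path entry per line, resolved
# relative to the directory that contains the .pth file.
# "demo.pth" is a hypothetical name; any *.pth file is processed.
with open(os.path.join(fake_site, "demo.pth"), "w") as f:
    f.write("./" + egg_name + "\n")

# Add the directory to sys.path and process the .pth files found in it.
site.addsitedir(fake_site)
importlib.invalidate_caches()

egg_on_path = os.path.abspath(egg_dir) in sys.path
haha = importlib.import_module("haha")  # no require() needed
print(egg_on_path, haha.__name__)
```

In this sketch the egg directory name plays no role in the lookup. The module becomes importable only because a line in a .pth file puts that directory on sys.path.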

easy_install makes heavy use of this Python feature. When you use easy_install to add or update a distribution, it modifies easy-install.pth to add the egg directory or zip file. In this way, easy_install maintains control of the module searching order and ensures that the eggs it installs appear early in the search order. Here is an example of the contents of an easy-install.pth:

import sys; sys.__plen = len(sys.path)
./appscript-0.21.1-py2.6-macosx-10.5-ppc.egg
./yolk-0.4.1-py2.6.egg
./Elixir-0.7.1-py2.6.egg
./Fabric-0.9.0-py2.6.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
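The two import lines in that file are executed by site.py like any other .pth line that starts with import. The first records the length of sys.path before the egg entries are appended. The last moves the newly appended entries toward the front. Below is a small sketch of that reordering on a plain list instead of the real sys.path. The function name apply_egg_insert and the two starting path entries are made up for the illustration.

```python
def apply_egg_insert(path, plen, egginsert=0):
    """Move the entries appended after index plen to position egginsert.

    Mirrors the last line of the easy-install.pth shown above, but works
    on a plain list and returns the next insert position instead of
    storing it as sys.__egginsert.
    """
    new = path[plen:]                # entries added by the .pth lines
    del path[plen:]                  # remove them from the end
    path[egginsert:egginsert] = new  # splice them in near the front
    return egginsert + len(new)


# Hypothetical starting search path.
path = ["/usr/lib/python2.6", "/usr/lib/python2.6/site-packages"]

plen = len(path)                     # what the first import line records
path.append("./yolk-0.4.1-py2.6.egg")
path.append("./Elixir-0.7.1-py2.6.egg")

next_insert = apply_egg_insert(path, plen)
print(path)
print(next_insert)
```

After the call, the two egg entries sit ahead of the original entries and next_insert is 2. This is how eggs installed by easy_install end up early in the search order.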

As this example shows, and as the setuptools code confirms, easy_install goes through some trickery to bootstrap itself and then cover its tracks, which can make debugging problems with site.py and interpreter startup a bit interesting. (That is one of the reasons some developers are not fond of using it.)

If you use the -m parameter of easy_install to install a distribution as multi-version, its easy-install.pth entry is not added, or is removed if it already exists. This is why the easy_install documentation tells you to use -m before deleting an installed egg.