I would like to build pandas from source rather than use a package manager because I am interested in contributing. The first time I tried to build pandas, these were the steps I took:
1) created the virtualenv
mkvirtualenv --no-site-packages pandas
2) activated the virtualenv
3) installed Anaconda CE. However, this was installed in ~/anaconda.
4) cloned pandas
5) built C extensions in place
(pandas)ems ~/.virtualenvs/pandas/localrepo/pandas> ~/anaconda/bin/python setup.py build_ext --inplace
6) built pandas
(pandas)ems ~/.virtualenvs/pandas/localrepo/pandas> ~/anaconda/bin/python setup.py build
7) ran nosetests on master branch
Tests failed:
(pandas)ems ~/.virtualenvs/pandas/localrepo/pandas> nosetests pandas
E
======================================================================
ERROR: Failure: ValueError (numpy.dtype has the wrong size, try recompiling)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/Users/EmilyChen/.virtualenvs/pandas/lib/python2.7/site-packages/nose/loader.py", line 390, in loadTestsFromName
addr.filename, addr.module)
File "/Users/EmilyChen/.virtualenvs/pandas/lib/python2.7/site-packages/nose/importer.py", line 39, in importFromPath
return self.importFromDir(dir_path, fqname)
File "/Users/EmilyChen/.virtualenvs/pandas/lib/python2.7/site-packages/nose/importer.py", line 86, in importFromDir
mod = load_module(part_fqname, fh, filename, desc)
File "/Users/EmilyChen/.virtualenvs/pandas/localrepo/pandas/pandas/__init__.py", line 6, in <module>
from . import hashtable, tslib, lib
File "numpy.pxd", line 156, in init pandas.hashtable (pandas/hashtable.c:20354)
ValueError: numpy.dtype has the wrong size, try recompiling
----------------------------------------------------------------------
Ran 1 test in 0.001s
FAILED (errors=1)
Someone on the PyData mailing list said:
It looks like you have NumPy installed someplace else on your machine and AnacondaCE is not playing nicely in the virtualenv. The error you are getting is a Cython error message which occurs when the NumPy version it built against doesn't match the installed version on your system-- I had thought that 1.7.x was supposed to be ABI compatible with 1.6.x (so this would not happen) but I guess not. Sigh
The numpy version in the Anaconda CE library is 1.7.0b2 and my system numpy installation is version 1.5.1. setup.py compiled against the numpy in the Anaconda distribution's libraries when it built pandas, but my guess is that the system version is the one being imported when nosetests loads pandas/__init__.py.
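A quick way to test that guess is to ask each interpreter which numpy it actually imports. The snippet below is only a sketch: `describe_module` is a hypothetical helper name, and it assumes nothing beyond the standard library. Running it once with ~/anaconda/bin/python and once with the virtualenv's python should show whether the two resolve numpy to different versions or locations.

```python
from __future__ import print_function

import importlib
import sys


def describe_module(name):
    """Return (version, file path) for a module, or None if importing it fails."""
    try:
        module = importlib.import_module(name)
    except Exception as exc:  # ImportError, or e.g. ValueError raised during import
        print("could not import %s: %s" % (name, exc))
        return None
    # Not every module defines these attributes, so fall back to placeholders.
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "built-in")
    return version, path


if __name__ == "__main__":
    # sys.executable shows which interpreter is running this script.
    print("interpreter:", sys.executable)
    print("numpy:", describe_module("numpy"))
```

If the reported numpy paths differ between the interpreter that built the extensions and the one that runs the tests, that would be consistent with the mismatch described in the mailing list reply.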
Next, I repeated the steps outside a virtualenv, but got the same error. Finally, I decided to install all the dependencies in a new virtualenv instead of using the Anaconda distribution to build pandas. This way, I can see that dependencies like numpy reside in the lib directory of the virtualenv's Python installation, which takes precedence when pandas/__init__.py runs its import statements. This is what I did:
1) installed numpy, dateutil, pytz, cython, scipy, matplotlib and openpyxl using pip
2) built C extensions in place
3) pandas install output here: http://pastebin.com/3CKf1f9i
4) pandas did not install correctly
(pandas)ems ~/.virtualenvs/pandas/localrepo/pandas> python
Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
cannot import name hashtable
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "pandas/__init__.py", line 6, in <module>
from . import hashtable, tslib, lib
ImportError: cannot import name hashtable
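One thing worth checking at this point is whether the build actually left compiled extension modules next to pandas/__init__.py in the directory being imported. The following is a minimal sketch, not part of pandas: `compiled_extensions` is a hypothetical helper, and it assumes that extensions show up as `.so` files (or `.pyd` on Windows) directly inside the package directory.

```python
import glob
import os


def compiled_extensions(package_dir):
    """List names of compiled extension modules (*.so, *.pyd) found in package_dir."""
    names = set()
    for pattern in ("*.so", "*.pyd"):
        for path in glob.glob(os.path.join(package_dir, pattern)):
            # Keep only the module name, dropping any platform tag and suffix,
            # e.g. "hashtable.cpython-311-darwin.so" becomes "hashtable".
            names.add(os.path.basename(path).split(".")[0])
    return sorted(names)


if __name__ == "__main__":
    # Assumes the script is run from the root of the cloned repository.
    print(compiled_extensions("pandas"))
```

If `hashtable` is missing from the list for the directory that `import pandas` resolves to, the import in pandas/__init__.py has nothing to load.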
I took a look at this question, but Cython is installed in my case, and I am trying to build successfully from source rather than installing with pip as the answer recommended.
(pandas)ems ~/.virtualenvs/pandas/localrepo/pandas> which cython
/Users/EmilyChen/.virtualenvs/pandas/bin/cython
I've received the same error (ImportError: cannot import name hashtable) when trying to import pandas from the source code directory. Try starting the Python interpreter from a different directory and importing pandas again.
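The reason this can help is that an interactive interpreter searches the current directory first, so a `pandas` folder in the working directory is picked up ahead of any installed copy. Here is a small sketch of that check; `shadowed_by_cwd` is a hypothetical helper written for illustration, not something pandas provides.

```python
import os


def shadowed_by_cwd(package, cwd=None):
    """Return the local directory that would shadow `package`, or None if there is none."""
    cwd = os.path.abspath(cwd or os.getcwd())
    candidate = os.path.join(cwd, package)
    # A directory containing __init__.py is importable as a package.
    if os.path.isfile(os.path.join(candidate, "__init__.py")):
        return candidate
    return None


if __name__ == "__main__":
    hit = shadowed_by_cwd("pandas")
    if hit:
        print("'import pandas' here would load the source tree at", hit)
    else:
        print("no local pandas directory in", os.getcwd())
```

Run from the root of the cloned repository, this should report the source tree; run from, say, the home directory, it should report nothing, and `import pandas` would then use whatever copy is installed in the environment.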