Load Python 2 .npy file in Python 3

Frozen Flame · Jun 8, 2014

I'm trying to load /usr/share/matplotlib/sample_data/goog.npy:

import matplotlib.cbook
import numpy as np

datafile = matplotlib.cbook.get_sample_data('goog.npy', asfileobj=False)
np.load(datafile)

It's fine in Python 2.7, but raises an exception in Python 3.4:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xd4 in position 1: ordinal not in range(128)

I assume it has something to do with the bytes/str/unicode differences between Python 2 and 3, but I have no idea how to work around it.

Question:

  • How do I load a .npy file (NumPy data) written by Python 2 in Python 3?

Answer

pv. · Jun 9, 2014

The problem is that the file contains serialized (pickled) Python datetime objects, not just numerical data. The Python serialization format for these objects is not compatible between Python 2 and Python 3:

python2
>>> import datetime
>>> import pickle
>>> pickle.dumps(datetime.datetime.now())
"cdatetime\ndatetime\np0\n(S'\\x07\\xde\\x06\\t\\x0c\\r\\x19\\x0f\\x1fP'\np1\ntp2\nRp3\n."

and

python3
>>> import pickle
>>> pickle.loads(b"cdatetime\ndatetime\np0\n(S'\\x07\\xde\\x06\\t\\x0c\\r\\x19\\x0f\\x1fP'\np1\ntp2\nRp3\n.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xde in position 1: ordinal not in range(128)
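The failure above, and one way around it, can be reproduced with the standard library alone. The sketch below reuses the pickled bytes from the Python 2 session and passes encoding="bytes" to pickle.loads, so Python 2 str payloads are kept as raw bytes instead of being decoded as ASCII. The names PY2_PICKLE and load_py2_pickle are illustrative only and do not come from NumPy or matplotlib.

```python
import datetime
import pickle

# Protocol-0 pickle of a datetime, as written by Python 2
# (the same bytes as in the interactive session above).
PY2_PICKLE = (
    b"cdatetime\ndatetime\np0\n"
    b"(S'\\x07\\xde\\x06\\t\\x0c\\r\\x19\\x0f\\x1fP'\np1\ntp2\nRp3\n."
)


def load_py2_pickle(data):
    """Unpickle Python 2 data, keeping Python 2 `str` payloads as bytes."""
    return pickle.loads(data, encoding="bytes")


# The default encoding ("ASCII") cannot decode the byte 0xde in the state string.
try:
    pickle.loads(PY2_PICKLE)
    default_failed = False
except UnicodeDecodeError:
    default_failed = True

# With encoding="bytes" the datetime constructor receives its 10-byte state intact.
restored = load_py2_pickle(PY2_PICKLE)
print(default_failed, restored)
```

Here restored is datetime.datetime(2014, 6, 9, 12, 13, 25, 991056). Note that encoding="bytes" also turns every other Python 2 str in the pickle (dictionary keys, for example) into bytes, so callers may need to decode those themselves.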

A workaround is to edit the NumPy source. In numpy/lib/format.py, line 446, change

numpy/lib/format.py:
...
        array = pickle.load(fp)

to array = pickle.load(fp, encoding="bytes"). A better solution would be for numpy.load to pass the encoding parameter on to pickle.
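Later NumPy releases (1.10 and newer, to the best of my knowledge) accept an encoding argument in np.load and forward it to pickle, so patching format.py should no longer be necessary there. The following is a minimal sketch under that assumption. load_legacy_npy is a hypothetical helper name, and the demo file is written by the running interpreter purely to show the call; a real Python 2 file would be passed the same way.

```python
import os
import tempfile

import numpy as np


def load_legacy_npy(path, encoding="bytes"):
    """Load a .npy file that may hold objects pickled by Python 2.

    allow_pickle=True is needed for object arrays in recent NumPy versions;
    unpickling can run arbitrary code, so use it only on trusted files.
    NumPy accepts "ASCII", "latin1" or "bytes" for `encoding`.
    """
    return np.load(path, allow_pickle=True, encoding=encoding)


# Demo round trip with an object array saved by this interpreter.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.npy")
    np.save(demo_path, np.array([1.5, "text", None], dtype=object))
    demo = load_legacy_npy(demo_path)

print(demo)
```

encoding="bytes" mirrors the patch described above. encoding="latin1" is the other commonly used choice and yields str instead of bytes, but whether pickled datetime objects load with it depends on the Python version.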