Pickle versus shelve storing large dictionaries in Python

user248237 · Feb 3, 2013 · Viewed 18.3k times

If I am storing a large dictionary as a pickle file, does loading it via cPickle mean that it will all be loaded into memory at once?

If so, is there a cross-platform way to get something like pickle, but access each entry one key at a time (i.e. avoid loading all of the dictionary into memory and only load each entry by name)? I know shelve is supposed to do this: is that as portable as pickle though?

Answer

jimhark · Feb 3, 2013

I know shelve is supposed to do this: is that as portable as pickle though?

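Yes. shelve is part of the Python Standard Library and is written in Python, so the module is available wherever pickle is. One caveat: shelve stores its data through a dbm backend, and which backend is used depends on what the platform provides, so a shelf file created on one system is not guaranteed to open on another.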

Edit

So if you have a large dictionary:

bigd = {'a': 1, 'b': 2,  # . . .
}

If you want to save it without having to read the whole thing back in later, don't save it as a pickle. It would be better to save it as a shelf, a sort of on-disk dictionary:

import shelve

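# Create (or open) the shelf file and copy every item of bigd into it.
myShelve = shelve.open('my.shelve')
myShelve.update(bigd)
# Closing the shelf flushes the data to disk.
myShelve.close()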

Then later you can:

import shelve

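myShelve = shelve.open('my.shelve')
value = myShelve['a']  # only this entry is unpickled
value += 1
myShelve['a'] = value  # assign back to store the new value
myShelve.close()  # flush the change to disk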

You basically treat the shelve object like a dict, but the items are stored on disk (as individual pickles) and read in as needed.
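
A minimal sketch of that dict-like usage, assuming Python 3 (where a shelf can be used as a context manager) and a hypothetical file name, demo.shelve, in a temporary directory. It also shows one caveat worth knowing: with the default writeback=False, a value fetched from the shelf is a fresh copy, so mutating it in place does not change what is on disk until it is assigned back.

```python
import os
import shelve
import tempfile

# Hypothetical location; a temporary directory keeps the example self-contained.
path = os.path.join(tempfile.mkdtemp(), 'demo.shelve')

# Keys must be strings; values can be any picklable object.
with shelve.open(path) as shelf:
    shelf['scores'] = [1, 2, 3]

with shelve.open(path) as shelf:
    # Each lookup unpickles a new copy, so this append is not saved.
    shelf['scores'].append(4)
    unsaved = shelf['scores']

    # Read, modify, then assign back to persist the change.
    scores = shelf['scores']
    scores.append(4)
    shelf['scores'] = scores

with shelve.open(path) as shelf:
    saved = shelf['scores']
    keys = sorted(shelf.keys())
```

Here unsaved is still [1, 2, 3], while saved is [1, 2, 3, 4]. Opening the shelf with writeback=True avoids the reassignment step, at the cost of caching every accessed entry in memory until the shelf is closed.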

If your objects could be stored as a list of properties, then sqlite may be a good alternative. Shelves and pickles are convenient, but they can only be read by Python, whereas a sqlite database can be read from most languages.
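
As a rough sketch of the sqlite route, assuming the values are simple enough to fit in a column: the table name items, its two columns, the file name bigd.sqlite, and the helper get_value are all hypothetical choices for illustration, not part of any standard schema. The sqlite3 module ships with Python's standard library.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'bigd.sqlite')
bigd = {'a': 1, 'b': 2}

conn = sqlite3.connect(path)
# One row per dictionary entry; the key column is the lookup index.
conn.execute(
    'CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, value INTEGER)'
)
conn.executemany(
    'INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)',
    bigd.items(),
)
conn.commit()


def get_value(connection, key):
    """Fetch a single entry by key without reading the rest of the table."""
    row = connection.execute(
        'SELECT value FROM items WHERE key = ?', (key,)
    ).fetchone()
    if row is None:
        raise KeyError(key)
    return row[0]


# Same read-modify-write cycle as the shelve example.
value = get_value(conn, 'a')
conn.execute('UPDATE items SET value = ? WHERE key = ?', (value + 1, 'a'))
conn.commit()
updated = get_value(conn, 'a')
conn.close()
```

The resulting file can be opened by the sqlite3 command-line tool or by sqlite bindings in other languages, which is the portability advantage over a shelf.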