*large* python dictionary with persistence storage for quick look-ups

Abhi picture Abhi · Aug 7, 2012 · Viewed 9.7k times · Source

I have a 400 million lines of unique key-value info that I would like to be available for quick look ups in a script. I am wondering what would be a slick way of doing this. I did consider the following but not sure if there is a way to disk map the dictionary and without using a lot of memory except during dictionary creation.

  1. pickled dictionary object : not sure if this is an optimum solution for my problem
  2. NoSQL type dbases : ideally want something which has minimum dependency on third party stuff plus the key-value are simply numbers. If you feel this is still the best option, I would like to hear that too. May be it will convince me.

Please let me know if anything is not clear.

Thanks! -Abhi

Answer

Sam Mussmann picture Sam Mussmann · Aug 7, 2012

If you want to persist a large dictionary, you are basically looking at a database.

Python comes with built in support for sqlite3, which gives you an easy database solution backed by a file on disk.