Recommend a fast & scalable persistent Map - Java

Joel · Oct 8, 2009 · Viewed 21.1k times

I need a disk-backed Map structure to use in a Java app. It must meet the following criteria:

  1. Capable of storing millions of records (even billions)
  2. Fast lookup - the majority of operations on the Map will simply be to see if a key already exists. This and criterion 1 above are the most important criteria. There should be an effective in-memory caching mechanism for frequently used keys.
  3. Persistent, but does not need to be transactional and can live with some failure, i.e. happy to sync with disk periodically.
  4. Capable of storing simple primitive types - but I don't need to store serialised objects.
  5. It does not need to be distributed, i.e. it will all run on one machine.
  6. Simple to set up & free to use.
  7. No relational queries required

Record keys will be strings or longs. As described above, reads will be much more frequent than writes, and the majority of reads will simply check whether a key exists (i.e. they will not need to read the key's associated data). Each record will be updated only once, and records are not deleted.
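To make the access pattern concrete, here is a minimal sketch of an in-memory cache sitting in front of a slower existence check. It is only an illustration under stated assumptions: the class name `ExistenceCache`, the `diskLookup` predicate (a stand-in for whatever disk-backed store is used), and the `capacity` parameter are all hypothetical. Because records are never deleted, a key that has been seen once can be remembered safely; misses are not cached, since a missing key may be written later.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongPredicate;

// Sketch only: an LRU cache of keys known to exist, in front of a slower lookup.
class ExistenceCache {
    // Hypothetical stand-in for the disk-backed store's "does this key exist?" call.
    private final LongPredicate diskLookup;
    // Access-ordered LinkedHashMap gives least-recently-used eviction.
    private final Map<Long, Boolean> known;
    // Counts how often we had to fall through to the slow lookup.
    int diskReads = 0;

    ExistenceCache(LongPredicate diskLookup, final int capacity) {
        this.diskLookup = diskLookup;
        this.known = new LinkedHashMap<Long, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, Boolean> eldest) {
                // Evict the least recently used key once the cache is over capacity.
                return size() > capacity;
            }
        };
    }

    boolean contains(long key) {
        // get() (unlike containsKey()) refreshes the entry's position in LRU order.
        if (known.get(key) != null) {
            return true;
        }
        diskReads++;
        boolean present = diskLookup.test(key);
        if (present) {
            // Safe to remember: records are never deleted, so this cannot go stale.
            known.put(key, Boolean.TRUE);
        }
        // Misses are deliberately not cached, because the key may be written later.
        return present;
    }
}
```

Frequently checked keys stay in memory, and only cold or absent keys reach the disk lookup. String keys would work the same way with a `Predicate<String>` in place of `LongPredicate`.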

I currently use BDB JE but am seeking other options.


Update

Have since improved query performance on my existing BDB setup by reducing the dependency on secondary keys. Some queries required a join on two secondary keys and by combining them into a composite key I removed a level of indirection in the lookup which speeds things up nicely.
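The idea can be sketched roughly as follows. This is plain in-memory Java, not BDB JE code: the class `CompositeKeyDemo`, the fields `userId` and `itemId`, and the use of `HashMap`/`HashSet` as stand-ins for the indexes are assumptions made purely for illustration. It contrasts a lookup that joins two secondary indexes with a single probe on a combined key.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch only: in-memory maps stand in for the secondary indexes of a real store.
class CompositeKeyDemo {
    // Two separate secondary indexes, each mapping a field value to primary keys.
    static final Map<Long, Set<Long>> byUser = new HashMap<>();
    static final Map<Long, Set<Long>> byItem = new HashMap<>();
    // One composite index keyed on both fields at once.
    static final Map<String, Long> byUserAndItem = new HashMap<>();

    // Hypothetical key format; any unambiguous encoding of the two fields would do.
    static String compositeKey(long userId, long itemId) {
        return userId + ":" + itemId;
    }

    static void put(long primaryKey, long userId, long itemId) {
        byUser.computeIfAbsent(userId, k -> new HashSet<>()).add(primaryKey);
        byItem.computeIfAbsent(itemId, k -> new HashSet<>()).add(primaryKey);
        byUserAndItem.put(compositeKey(userId, itemId), primaryKey);
    }

    // Join approach: two index lookups, then an intersection of the results.
    static boolean existsViaJoin(long userId, long itemId) {
        Set<Long> a = byUser.get(userId);
        Set<Long> b = byItem.get(itemId);
        if (a == null || b == null) {
            return false;
        }
        for (Long primaryKey : a) {
            if (b.contains(primaryKey)) {
                return true;
            }
        }
        return false;
    }

    // Composite approach: a single lookup, with no intermediate result sets.
    static boolean existsViaComposite(long userId, long itemId) {
        return byUserAndItem.containsKey(compositeKey(userId, itemId));
    }

    public static void main(String[] args) {
        put(1L, 10L, 20L);
        put(2L, 10L, 30L);
        System.out.println(existsViaJoin(10L, 20L));
        System.out.println(existsViaComposite(10L, 20L));
    }
}
```

Both methods answer the same question, but the composite version touches one index instead of two and skips the intersection step, which is the level of indirection that was removed.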

Answer

Andrejs · Feb 23, 2012

JDBM3 does exactly what you are looking for. It is a library of disk-backed maps with a really simple API and high performance.

UPDATE

This project has since evolved into MapDB: http://www.mapdb.org