Fast embedded database

CmdrMoozy picture CmdrMoozy · Jul 3, 2013 · Viewed 8.5k times · Source

I am working on an application which will need to store metadata associated with music files (artist, title, play count, etc.), as well as sets of integers (in particular, SHA-1 hashes).

The solution I pick needs to:

  • Provide "fast" storage & retrieval (when viewing a list of potentially thousands of songs I need to be able to retrieve metadata more or less interactively).
  • Be cross-platform (to Linux, Windows and OSX).
  • Provide an interface I can interact with from C++.
  • Be open-source (or, at the very least, be free as in beer).
  • Provide fast set operations (union, intersection, difference) - if the solution doesn't provide this, but it will allow me to store binary data, I could implement this myself using a technique like "Fast Set Operations Using Treaps".
  • Be "embedded" - that is, operate without me having to fork another process, or at least provide an easy interface to do so (like libmysqld).

Solutions I have considered include:

  • Flat files. This is extremely simple, but doesn't provide any features besides flat data storage.
  • SQlite. This seems to be a very popular option, but it seems to have some issues regarding performance and concurrency (see KDE's Akonadi, for some example issues).
  • Embedded MySQL/MariaDB. This seems to be a reasonable option, but it also might be a bit heavyweight considering I won't be needing a lot of complicated SQL features.

A hypothetical solution I'm thinking would be perfect would be something like Redis, but which persists data to the disk, and only stores some portion of the data in memory to make retrieval fast. Redis itself might not be a good option because 1) I would need to fork it manually, 2) its Windows port seems less than rock-solid, and 3) storing all of my data in RAM would be less than ideal.

Are there any other solutions for this type of problem, or is one of the solutions I have already listed far better than the others?

Answer

CmdrMoozy picture CmdrMoozy · Jul 19, 2013

In the end, I've decided to use SQlite for metadata. It seems to be as fast if not faster than e.g. libmysqld, and it has a really simple clean C interface. According to benchmarks, it should be way more than fast enough to suit my needs.

For larger data structures, I'm planning on just storing them in separate binary files (the SQlite website says it can store binary data, but that if your data size exceeds a certain amount it is faster to store it in flat files instead - see this page).