What is the best open source solution for storing time series data?

lorg · Aug 26, 2009 · Viewed 13.5k times

I am interested in monitoring some objects. I expect to get about 10000 data points every 15 minutes. (Maybe not at first, but this is the 'general ballpark'). I would also like to be able to get daily, weekly, monthly and yearly statistics. It is not critical to keep the data in the highest resolution (15 minutes) for more than two months.

I am considering various ways to store this data, and have been looking at a classic relational database, or at a schemaless database (such as SimpleDB).

My question is, what is the best way to go about doing this? I would very much prefer an open-source (and free) solution to a costly proprietary one.

Small note: I am writing this application in Python.

Answer

tom10 · Aug 26, 2009

HDF5, which can be accessed through h5py or PyTables, is designed for dealing with very large data sets. Both interfaces work well. For example, both h5py and PyTables offer transparent compression and support NumPy.
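
To make this a bit more concrete, here is a minimal sketch of how the 15-minute samples from the question might be stored with h5py and then reduced to daily means. The layout is an assumption on my part, not something the question or HDF5 dictates: one row per 15-minute sample, one column per monitored object, 32-bit floats, gzip compression. The dataset names `raw` and `daily_mean` and the helper functions are hypothetical, and the data is random.

```python
import os
import tempfile

import h5py
import numpy as np

POINTS_PER_SAMPLE = 10_000  # data points per 15-minute sample (from the question)
SAMPLES_PER_DAY = 24 * 4    # one sample every 15 minutes


def append_sample(dset, values):
    """Append one 15-minute sample as a new row of a resizable dataset."""
    row = dset.shape[0]
    dset.resize(row + 1, axis=0)
    dset[row, :] = values


def daily_means(dset):
    """Average each complete day of 15-minute samples into a single row."""
    n_days = dset.shape[0] // SAMPLES_PER_DAY
    data = dset[: n_days * SAMPLES_PER_DAY, :]
    return data.reshape(n_days, SAMPLES_PER_DAY, -1).mean(axis=1)


def demo(path):
    """Write two days of random samples, then store their daily means."""
    with h5py.File(path, "w") as f:
        raw = f.create_dataset(
            "raw",  # hypothetical dataset name
            shape=(0, POINTS_PER_SAMPLE),
            maxshape=(None, POINTS_PER_SAMPLE),  # unlimited along the time axis
            dtype="f4",
            chunks=(1, POINTS_PER_SAMPLE),  # one chunk per sample keeps appends cheap
            compression="gzip",
        )
        rng = np.random.default_rng(0)
        for _ in range(2 * SAMPLES_PER_DAY):
            append_sample(raw, rng.random(POINTS_PER_SAMPLE, dtype=np.float32))

        f.create_dataset("daily_mean", data=daily_means(raw), compression="gzip")
        return f["raw"].shape, f["daily_mean"].shape


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(os.path.join(tmp, "monitoring.h5")))
```

Weekly, monthly and yearly statistics could be derived the same way from the daily rows. Since the question only needs full resolution for about two months, older rows of `raw` could then be dropped or archived while the much smaller summaries are kept.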