Usecases: InfluxDB vs. Prometheus

SpaceMonkey picture SpaceMonkey · Oct 26, 2015 · Viewed 42.4k times · Source

Following the Prometheus webpage one main difference between Prometheus and InfluxDB is the usecase: while Prometheus stores time series only InfluxDB is better geared towards storing individual events. Since there was some major work done on the storage engine of InfluxDB I wonder if this is still true.

I want to setup a time series database and apart from the push/push model (and probably a difference in performance) I can see no big thing which separates both projects. Can someone explain the difference in usecases?

Answer

Paul Dix picture Paul Dix · Oct 28, 2015

InfluxDB CEO and developer here. The next version of InfluxDB (0.9.5) will have our new storage engine. With that engine we'll be able to efficiently store either single event data or regularly sampled series. i.e. Irregular and regular time series.

InfluxDB supports int64, float64, bool, and string data types using different compression schemes for each one. Prometheus only supports float64.

For compression, the 0.9.5 version will have compression competitive with Prometheus. For some cases we'll see better results since we vary the compression on timestamps based on what we see. Best case scenario is a regular series sampled at exact intervals. In those by default we can compress 1k points timestamps as an 8 byte starting time, a delta (zig-zag encoded) and a count (also zig-zag encoded).

Depending on the shape of the data we've seen < 2.5 bytes per point on average after compactions.

YMMV based on your timestamps, the data type, and the shape of the data. Random floats with nanosecond scale timestamps with large variable deltas would be the worst, for instance.

The variable precision in timestamps is another feature that InfluxDB has. It can represent second, millisecond, microsecond, or nanosecond scale times. Prometheus is fixed at milliseconds.

Another difference is that writes to InfluxDB are durable after a success response is sent to the client. Prometheus buffers writes in memory and by default flushes them every 5 minutes, which opens a window of potential data loss.

Our hope is that once 0.9.5 of InfluxDB is released, it will be a good choice for Prometheus users to use as long term metrics storage (in conjunction with Prometheus). I'm pretty sure that support is already in Prometheus, but until the 0.9.5 release drops it might be a bit rocky. Obviously we'll have to work together and do a bunch of testing, but that's what I'm hoping for.

For single server metrics ingest, I would expect Prometheus to have better performance (although we've done no testing here and have no numbers) because of their more constrained data model and because they don't append writes to disk before writing out the index.

The query language between the two are very different. I'm not sure what they support that we don't yet or visa versa so you'd need to dig into the docs on both to see if there's something one can do that you need. Longer term our goal is to have InfluxDB's query functionality be a superset of Graphite, RRD, Prometheus and other time series solutions. I say superset because we want to cover those in addition to more analytic functions later on. It'll obviously take us time to get there.

Finally, a longer term goal for InfluxDB is to support high availability and horizontal scalability through clustering. The current clustering implementation isn't feature complete yet and is only in alpha. However, we're working on it and it's a core design goal for the project. Our clustering design is that data is eventually consistent.

To my knowledge, Prometheus' approach is to use double writes for HA (so there's no eventual consistency guarantee) and to use federation for horizontal scalability. I'm not sure how querying across federated servers would work.

Within an InfluxDB cluster, you can query across the server boundaries without copying all the data over the network. That's because each query is decomposed into a sort of MapReduce job that gets run on the fly.

There's probably more, but that's what I can think of at the moment.