Can Druid replace Cassandra?

TechJack · Jan 7, 2015

I can't help thinking that there aren't many use cases that Cassandra serves better than Druid. Whether the data is a time series or key-value, queries can be written in Druid to extract it however needed. The argument here is more about justifying Druid over Cassandra.

Apart from Cassandra's fast writes, is there really anything else? Given Druid's real-time aggregation and querying capabilities, doesn't it outweigh Cassandra?

For a more direct question that can be answered: doesn't Druid provide a superset of features compared to Cassandra, and wouldn't one be better off using Druid right away? For all use cases?

Answer

Nylon Smile · Mar 2, 2015

For a more direct question that can be answered: doesn't Druid provide a superset of features compared to Cassandra, and wouldn't one be better off using Druid right away? For all use cases?

Not at all; they are not comparable. We are talking about two very different technologies here. An easy way to see it is that Cassandra is a distributed storage solution, while Druid is a distributed aggregator (i.e. an awesome open-source OLAP-like tool). The post you are referring to is, in my opinion, a bit misleading, in the sense that it compares the two projects in the world of data mining, which is not Cassandra's focus.

Druid is not good at point lookups, at all. It loves time series, and its partitioning is mainly based on time-based segments (e.g. hourly or monthly segments, which may be further sharded based on size).
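To make the point-lookup problem concrete, here is a toy sketch in plain Python. It is not Druid's API or its actual storage format; `segment_key`, `build_segments`, `rows_in_range` and `lookup_user` are made-up names, and the events are invented. It only assumes what is said above: rows are grouped into segments by time (hourly here).

```python
from collections import defaultdict
from datetime import datetime


def segment_key(ts):
    """Hypothetical segment id: the timestamp truncated to the hour."""
    return ts.strftime("%Y-%m-%dT%H:00")


def build_segments(events):
    """Group (timestamp, user, page) rows into hourly segments."""
    segments = defaultdict(list)
    for ts, user, page in events:
        segments[segment_key(ts)].append((ts, user, page))
    return dict(segments)


def rows_in_range(segments, start_key, end_key):
    """Time-range query: only segments inside [start_key, end_key) are read."""
    touched = sorted(k for k in segments if start_key <= k < end_key)
    rows = [row for k in touched for row in segments[k]]
    return rows, touched


def lookup_user(segments, user):
    """Point lookup by a non-time key: every segment has to be scanned."""
    hits, scanned = [], 0
    for rows in segments.values():
        scanned += 1
        hits.extend(row for row in rows if row[1] == user)
    return hits, scanned


# Invented sample data for illustration only.
events = [
    (datetime(2015, 1, 7, 10, 5), "alice", "/home"),
    (datetime(2015, 1, 7, 10, 40), "bob", "/pricing"),
    (datetime(2015, 1, 7, 11, 15), "alice", "/docs"),
    (datetime(2015, 1, 8, 9, 0), "carol", "/home"),
]
segments = build_segments(events)

range_rows, touched = rows_in_range(segments, "2015-01-07T10:00", "2015-01-07T12:00")
carol_rows, scanned = lookup_user(segments, "carol")
```

The time-range query reads only the two segments that fall inside the interval, while finding a single user means visiting all of them. In this toy model the time layout does nothing to narrow a lookup by key, and that is the kind of access pattern Cassandra is built for.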

Druid pre-aggregates your data based on pre-defined aggregators, which produce numbers (e.g. the sum of click events on your website at daily granularity). If you want to store a key lookup from a string to, say, another string or an exact number, Druid is the worst solution you could pick.
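A rough illustration of that pre-aggregation, again in plain Python rather than anything Druid-specific. `rollup_daily`, the sample events, and the `kv_store` key are all hypothetical; the sketch just shows daily click counts replacing the raw rows.

```python
from collections import Counter
from datetime import datetime

# Invented raw click events: (timestamp, page).
raw_events = [
    (datetime(2015, 3, 1, 9, 15), "/home"),
    (datetime(2015, 3, 1, 17, 42), "/home"),
    (datetime(2015, 3, 1, 18, 3), "/pricing"),
    (datetime(2015, 3, 2, 8, 30), "/home"),
]


def rollup_daily(events):
    """Collapse raw events into one count per (day, page)."""
    counts = Counter()
    for ts, page in events:
        counts[(ts.date().isoformat(), page)] += 1
    return dict(counts)


daily_clicks = rollup_daily(raw_events)

# What a key-value store is for: an exact string-to-string lookup.
# A plain dict stands in for it here; the key and value are made up.
kv_store = {"user:42:email": "someone@example.com"}
email = kv_store["user:42:email"]
```

After the rollup, the four raw events are three numeric rows, and the individual events can no longer be retrieved. That is great for "how many clicks per day" and useless for "give me the value stored under this key".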