HBase, Cassandra, CouchDB, MongoDB: any fundamental difference?

Federico · Sep 6, 2010 · Viewed 7.7k times

I just wanted to know if there is a fundamental difference between HBase, Cassandra, CouchDB and MongoDB. In other words, are they all competing in the exact same market and trying to solve the exact same problems? Or do they fit best in different scenarios?

All this comes down to the question: what should I choose, and when? Is it a matter of taste?

Thanks,

Federico

Answer

Gates VP · Sep 7, 2010

Those are some long answers from @Bohzo (but they are good links).

The truth is, they're "kind of" competing. But they definitely have different strengths and weaknesses and they definitely don't all solve the same problems.

For example, Couch and Mongo both provide Map-Reduce engines as part of the main package. HBase is (basically) a layer on top of Hadoop, so you also get M-R via Hadoop. Cassandra is highly focused on being a Key-Value store and has plug-ins to "layer" Hadoop on top (so you can map-reduce).
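If "map-reduce" is new to you, here is a rough sketch of the idea in plain Python. It is not the API of Couch, Mongo or Hadoop; the function names (`map_reduce`, `emit_tags`, `count`) and the sample documents are made up for illustration. The real engines run the same three steps, but spread across many machines.

```python
from collections import defaultdict


def map_phase(docs, map_fn):
    """Run the map function over every document, yielding (key, value) pairs."""
    for doc in docs:
        yield from map_fn(doc)


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reduce_fn):
    """Collapse each key's list of values into a single result."""
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def map_reduce(docs, map_fn, reduce_fn):
    return reduce_phase(shuffle(map_phase(docs, map_fn)), reduce_fn)


# Hypothetical job: count how many documents carry each tag.
def emit_tags(doc):
    for tag in doc["tags"]:
        yield tag, 1


def count(key, values):
    return sum(values)


docs = [
    {"title": "a", "tags": ["nosql", "scaling"]},
    {"title": "b", "tags": ["nosql"]},
]
tag_counts = map_reduce(docs, emit_tags, count)
```

The part you write is just the two small functions at the bottom; the engine owns the distribution and grouping, which is why it matters whether it ships in the main package or has to be layered on.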

Some of the DBs provide MVCC (Multi-version concurrency control). Mongo does not.
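To give a feel for what MVCC buys you, here is a toy in-memory model in Python. It is only a sketch of the general idea (keep old versions around so readers see a consistent snapshot while writers carry on); the `VersionedStore` class is hypothetical and is not how any of these databases implements it.

```python
class VersionedStore:
    """Toy MVCC store: every write appends a new version instead of overwriting."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical timestamp, bumped on every write

    def begin(self):
        """Return a snapshot timestamp; reads at it ignore later writes."""
        return self._clock

    def write(self, key, value):
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, snapshot):
        """Return the newest value committed at or before the snapshot."""
        for ts, value in reversed(self._versions.get(key, [])):
            if ts <= snapshot:
                return value
        return None


store = VersionedStore()
store.write("user:1", "alice")
snap = store.begin()                 # a reader starts here
store.write("user:1", "alicia")      # a writer updates afterwards
old_view = store.read("user:1", snap)            # reader still sees its snapshot
new_view = store.read("user:1", store.begin())   # a fresh reader sees the update
```

The reader holding `snap` is never blocked by the later write and never sees a half-updated state, which is the trade-off MVCC systems are making in exchange for storing extra versions.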

All of these DBs are intended to scale horizontally, but they do it in different ways. All of these DBs are also trying to provide flexibility in different ways. Flexible document sizes, REST APIs, high redundancy, ease of use: they're all making different trade-offs.

So to your question: In other words, are they all competing in the exact same market and trying to solve the exact same problems?

  1. Yes: they're all trying to solve the issue of database-scalability and performance.
  2. No: they're definitely making different sets of trade-offs.

What should you start with?

Man, that's a tough question. I work for a large company pushing tons of data and we've been through a few years. We tried Cassandra at one point a couple of years ago and it couldn't handle the load. We're using Hadoop everywhere, but it definitely has a steep learning curve and it hasn't worked out in some of our environments. More recently we've tried to do Cassandra + Hadoop, but it turned out to be a lot of configuration work.

Personally, my department is moving several things to MongoDB. Our reasons for this are honestly just simplicity.

Setting up Mongo on a Linux box takes minutes and doesn't require root access or a change to the file system or anything fancy. There are no crazy config files or Java recompiles required. So from that perspective, Mongo has been the easiest "gateway drug" for getting people onto KV/Document stores.