When to use CouchDB vs RDBMS

Andrew Whitehouse · Aug 20, 2009

I am looking at CouchDB, which has a number of appealing features over relational databases including:

  • intuitive REST/HTTP interface
  • easy replication
  • data stored as documents, rather than normalised tables
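
As a rough sketch of what the REST/HTTP interface looks like in practice: storing a document is a single HTTP PUT of a JSON body to `/{db}/{docid}`. The example assumes a CouchDB server on the default port 5984 and a hypothetical database named `notes`; neither comes from the question. It only builds the request, since sending it needs a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: local CouchDB on its default port
DB_NAME = "notes"                   # hypothetical database name


def build_put_request(doc_id, doc):
    """Build (but do not send) the HTTP PUT that stores `doc` under `doc_id`."""
    url = f"{BASE_URL}/{DB_NAME}/{doc_id}"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_put_request("first-note", {"title": "Trying CouchDB", "tags": ["nosql"]})
print(request.get_method(), request.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Reading the document back would be a plain GET on the same URL, which is much of what makes the interface easy to explore with ordinary HTTP tools.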

I appreciate that it is not yet a mature product and should be adopted with caution, but I am wondering whether it is actually a viable replacement for an RDBMS (in spite of the intro page saying otherwise: http://couchdb.apache.org/docs/intro.html).

  1. Under what circumstances would CouchDB be a better choice of database than an RDBMS (e.g. MySQL), for instance in terms of scalability, design and development time, reliability and maintenance?
  2. Are there cases where an RDBMS is still clearly the right choice?
  3. Is this an either-or choice, or is a hybrid solution more likely to emerge as best practice?

Answer

Andrew Whitehouse · Apr 28, 2010

I recently attended the NoSQL conference in London and think I have a better idea now how to answer the original question. I also wrote a blog post, and there are a couple of other good ones.

Key points:

  • We have accumulated probably 30 years' knowledge of administering relational databases, so we shouldn't replace them without careful consideration; non-relational data stores are less mature than relational ones, and so are inherently riskier to adopt
  • There are different types of non-relational data store; some are key-value stores, some are document stores, some are graph databases
  • You could use a hybrid approach, e.g. a combination of RDBMS and graph data store for a social software site
  • Document data stores (e.g. CouchDB and MongoDB) are probably the closest to relational databases. They store each record as a JSON document with its fields nested hierarchically, which avoids table joins and (some might argue) improves on the object-relational mapping that most applications currently use
  • Many non-relational databases support replication (including master-master); relational databases support replication too, but it may not be as comprehensive
  • Very large sites such as Twitter, Digg and Facebook use Cassandra, which is built from the ground up to support clustering
  • Relational databases are probably suitable for 90% of cases
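
To make the document-store point above concrete, here is a minimal Python sketch contrasting normalised rows with a single nested JSON document. The blog-post data, table layout and field names are all hypothetical, chosen only for illustration; the `_id` field follows CouchDB's naming for document identifiers.

```python
import json

# Hypothetical normalised rows, as they might sit in two relational tables.
posts = [{"id": 1, "title": "Hello CouchDB", "author": "andrew"}]
comments = [
    {"post_id": 1, "author": "bob", "text": "Nice post"},
    {"post_id": 1, "author": "carol", "text": "Thanks"},
]


def to_document(post, all_comments):
    """Embed a post's comments in the post itself, as a document store would hold it."""
    doc = {"_id": f"post-{post['id']}", "title": post["title"], "author": post["author"]}
    # The join on post_id happens once, at write time; readers get the whole tree.
    doc["comments"] = [
        {"author": c["author"], "text": c["text"]}
        for c in all_comments
        if c["post_id"] == post["id"]
    ]
    return doc


document = to_document(posts[0], comments)
print(json.dumps(document, indent=2))
```

The trade-off is that the nested form is convenient to read in one request, but data shared between documents is duplicated rather than referenced, which is part of why a relational schema remains the safer default for many applications.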

In summary, the consensus seems to be "proceed with caution".