Why would I want to use a non-relational database?

Naftuli Kay · Oct 28, 2011

The latest craze in databases seems to be centered around non-relational databases. Why? It seems kind of counterproductive. For example, it makes much more sense to me to express my data in a relational way (example code in Django + SQL for tables):

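from django.db import models

class Post(models.Model):
    # CharField requires max_length; 255 is an arbitrary choice
    name = models.CharField(max_length=255)
    # auto_now_add stamps the row once, when it is first created
    created = models.DateTimeField(auto_now_add=True)

class Comment(models.Model):
    text = models.TextField()
    # on_delete is mandatory in Django 2.0+; CASCADE removes a post's comments with it
    post = models.ForeignKey('Post', on_delete=models.CASCADE)
    created = models.DateTimeField(auto_now_add=True)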

SQL:

create table post (id int primary key auto_increment,
        name varchar(255),
        created datetime);

create table comment (id int primary key auto_increment,
        text text,
        post_id int,
        created datetime,
        foreign key (post_id) references post(id));

The power of SQL is that this information can be expressed in so many ways. Sure, the whole object-relational-mapping problem exists, but I look at it as a feature and not as a problem. With SQL, I can fetch all distinct comments of a given post which are older than yesterday, collate all of those together, and generate statistics. Can the same be done for non-relational databases?
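As a rough sketch of the kind of query I mean, here is the example run against an in-memory SQLite database through Python's standard sqlite3 module. The sample rows, the fixed "now" timestamp, and the SQLite column types are assumptions made only for illustration; the schema otherwise mirrors the tables above.

```python
import sqlite3
from datetime import datetime, timedelta

# In-memory database; table and column names follow the schema above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table post (id integer primary key, name text, created text);
    create table comment (id integer primary key, text text, post_id integer,
                          created text,
                          foreign key (post_id) references post(id));
""")

# Fixed "now" (an assumption) so the example is reproducible.
now = datetime(2011, 10, 28, 12, 0, 0)
cutoff = now - timedelta(days=1)

conn.execute(
    "insert into post (id, name, created) values (1, 'Hello', ?)",
    ((now - timedelta(days=3)).isoformat(),),
)

# Made-up sample comments: two old ones with the same text, one recent.
rows = [
    ("first", 1, now - timedelta(days=2)),
    ("first", 1, now - timedelta(days=2, hours=1)),
    ("recent", 1, now - timedelta(hours=1)),
]
conn.executemany(
    "insert into comment (text, post_id, created) values (?, ?, ?)",
    [(text, post_id, created.isoformat()) for text, post_id, created in rows],
)

# Distinct comment texts on post 1 that are older than the cutoff.
# ISO 8601 strings in one format sort the same way as the timestamps.
old_texts = [row[0] for row in conn.execute(
    "select distinct text from comment "
    "where post_id = ? and created < ? order by text",
    (1, cutoff.isoformat()),
)]

# A simple statistic over the same set: how many old comments there are.
old_count = conn.execute(
    "select count(*) from comment where post_id = ? and created < ?",
    (1, cutoff.isoformat()),
).fetchone()[0]

print(old_texts, old_count)
```

The same `where` clause feeds both the distinct listing and the count, which is the flexibility I would not want to give up.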

Using a non-relational database like MongoDB would also seem to hurt performance, because you would immediately grab an entire object graph rather than the minimum you need.

Can someone explain to me what the benefits are of using a non-relational database?

Answer

Nathan · Oct 29, 2011

Take a look at the CAP theorem and the PACELC interpretation.

Relational databases tend to make one set of trade-offs, and non-relational databases tend to make a different set. For massive distributed datasets, a non-relational database sometimes makes more sense.

There is also a sense in which non-relational databases can eliminate a lot of the ORM pain, but again there are always trade-offs. In some use cases, non-relational storage can be faster because all the data for a particular hierarchy can be stored close together on disk. Note also that non-relational databases still have query capabilities.
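To make the "stored together" idea concrete, here is a minimal sketch in plain Python. It is not the API of MongoDB or any other database: the `post_doc` layout, the sample comments, and the `old_comment_texts` helper are all hypothetical, and only illustrate a post kept as one nested record that can still be filtered.

```python
from datetime import datetime, timedelta

# Fixed "now" (an assumption) so the example is reproducible.
now = datetime(2011, 10, 29, 12, 0, 0)

# One hypothetical document: the post and its comments live in a single
# record, so reading the post brings the whole hierarchy with it.
post_doc = {
    "name": "Hello",
    "created": now - timedelta(days=3),
    "comments": [
        {"text": "first", "created": now - timedelta(days=2)},
        {"text": "first", "created": now - timedelta(days=2, hours=1)},
        {"text": "recent", "created": now - timedelta(hours=1)},
    ],
}


def old_comment_texts(doc, cutoff):
    """Return the distinct comment texts created before cutoff, sorted."""
    return sorted({c["text"] for c in doc["comments"] if c["created"] < cutoff})


cutoff = now - timedelta(days=1)
old_texts = old_comment_texts(post_doc, cutoff)
print(old_texts)
```

No join or object mapping is needed to get from the post to its comments, but every comment travels with the post whether or not it is wanted, which is one of the trade-offs mentioned above.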

In the end, it's about making the appropriate set of trade-offs for your particular use case.