Does Cassandra support sharding?

Chris Dutrow · May 7, 2013

Does Apache Cassandra support sharding?

I apologize if this question seems trivial, but I cannot find the answer. I have read that Cassandra was partially modeled after Google's Bigtable, which shards on a massive scale. But most of the Cassandra documentation I am finding seems to imply that Cassandra does not partition data horizontally across multiple machines, and instead relies on many duplicate machines. That would imply that Cassandra is a good fit for high-availability reads but would eventually break down if the write volume became very high.

Answer

Matt Self · May 7, 2013

Yes, Cassandra does partition data across nodes (because if you can't split it, you can't scale it). All of the data in a Cassandra cluster is divided up across "the ring," and each node on the ring is responsible for one or more key ranges. You have control over the partitioner (e.g. random or ordered) and, based on your requirements, over how many nodes on the ring each key is replicated to.
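To make the ring idea concrete, here is a minimal sketch in Python of how a key might be mapped to nodes. It is not Cassandra's actual code: the `Ring` class, the `replicas` method, and the node names are hypothetical. Giving each node a single token derived from its name is a simplifying assumption, and MD5 is used only to spread keys evenly, in the spirit of a random partitioner.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: each node owns the key range that ends at its token."""

    def __init__(self, nodes, replication_factor=2):
        self.replication_factor = replication_factor
        # Assumption: one token per node, derived by hashing the node name.
        self.tokens = sorted((self._hash(name), name) for name in nodes)

    @staticmethod
    def _hash(value):
        # Hash a string to a large integer position on the ring.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, key):
        """Return the owner of `key` followed by the next nodes clockwise."""
        token = self._hash(key)
        # First node whose token is >= the key's token; wraps past the end.
        start = bisect.bisect_left(self.tokens, (token, ""))
        count = min(self.replication_factor, len(self.tokens))
        return [
            self.tokens[(start + i) % len(self.tokens)][1]
            for i in range(count)
        ]


# Hypothetical four-node cluster where each key lives on two nodes.
ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=2)
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.replicas(key))
```

Because different keys hash to different positions on the ring, writes are spread across the nodes instead of every node holding a full copy, which is why adding nodes adds write capacity.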

For a pretty good overview, see: Basic architecture

Also, I highly recommend reading the Dynamo white paper. While Cassandra differs from Dynamo in many ways, conceptually they stem from the same roots. Check it out: Dynamo White Paper