Read Operation in Cassandra at Consistency level of Quorum?

brain storm · Jul 30, 2014 · Viewed 9.1k times

I am reading this post on read operations and consistency level in Cassandra. According to this post:

For example, in a cluster with a replication factor of 3, and a read consistency level of QUORUM, 2 of the 3 replicas for the given row are contacted to fulfill the read request. Supposing the contacted replicas had different versions of the row, the replica with the most recent version would return the requested data. In the background, the third replica is checked for consistency with the first two, and if needed, the most recent replica issues a write to the out-of-date replicas.

So even with a consistency level of QUORUM, it is not guaranteed that you won't get a stale read. According to the above paragraph, if the third replica holds the latest timestamp, the coordinator has already returned the newer of the two versions it received from the replicas it queried. That version is not the latest, because the third replica has a later timestamp.

Answer

Carlo Bertuccini · Jul 30, 2014

A read at CL QUORUM does not by itself guarantee the consistency of your data. What guarantees consistency is the following inequality:

(WRITE CL + READ CL) > REPLICATION FACTOR

Translated into consistency levels, the minimal W + R combinations that guarantee data consistency are:

WRITE ALL + READ ONE
WRITE ONE + READ ALL
WRITE QUORUM + READ QUORUM
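
As a rough illustration of the inequality, the sketch below computes how many replicas each consistency level waits for and checks whether a write/read pair satisfies it. The helper names `replicas_required` and `is_strongly_consistent` are made up for this example and are not part of any Cassandra driver; the quorum size is assumed to be floor(RF / 2) + 1.

```python
def replicas_required(level: str, rf: int) -> int:
    """Number of replica acknowledgements a consistency level waits for."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return rf // 2 + 1  # majority of the replicas
    if level == "ALL":
        return rf
    raise ValueError(f"unsupported consistency level: {level}")


def is_strongly_consistent(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when (WRITE CL + READ CL) > REPLICATION FACTOR."""
    return replicas_required(write_cl, rf) + replicas_required(read_cl, rf) > rf


if __name__ == "__main__":
    pairs = [("ALL", "ONE"), ("ONE", "ALL"), ("QUORUM", "QUORUM"), ("ONE", "QUORUM")]
    for w, r in pairs:
        print(f"WRITE {w} + READ {r}, RF=3:", is_strongly_consistent(w, r, 3))
```

With RF 3, the first three pairs sum to 4 and pass the check, while WRITE ONE + READ QUORUM sums to 3 and does not.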

As the post says, if you have a replication factor of 3 and you wrote with CL ONE, only one node is guaranteed to have the fresh data, while the other two might still hold old data. If you ask Cassandra for a CL QUORUM read, it might retrieve the data from those other two nodes (old data) and return that to the client. But since the coordinator sent the read request to all nodes (and waited for only 2 before responding to the client), it will find out which node has the freshest data and update the other nodes.
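
The stale-read case described above can be sketched with a toy model. The `Replica` class and the `quorum_read` function are hypothetical and only mimic last-write-wins resolution by timestamp followed by a repair step; they are an assumption for illustration, not Cassandra's actual internals.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    value: str
    timestamp: int  # write timestamp; the higher one wins


def newest(replicas):
    """Pick the replica holding the most recent version (last write wins)."""
    return max(replicas, key=lambda r: r.timestamp)


def quorum_read(responded, all_replicas):
    """Answer from the replicas that responded in time, then repair all replicas."""
    answer = newest(responded).value   # what the client receives
    freshest = newest(all_replicas)    # known once every replica has been compared
    for r in all_replicas:             # repair step: push the newest version everywhere
        r.value, r.timestamp = freshest.value, freshest.timestamp
    return answer


# Written with CL ONE: only node "a" holds the new value.
a = Replica("a", "new", 2)
b = Replica("b", "old", 1)
c = Replica("c", "old", 1)

first = quorum_read([b, c], [a, b, c])   # the two stale nodes answered first
second = quorum_read([b, c], [a, b, c])  # same two nodes, after the repair
print(first, second)
```

In this model the first read returns the old value, and the same read repeated after the repair returns the new one.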

On the other hand, in an RF 3 situation, if you write at QUORUM, at least 2 nodes will have the fresh data. A read at CL QUORUM contacts 2 of the 3 nodes, so at least one of those two nodes has the fresh data.
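
The overlap argument can be checked by brute force. The sketch below is a minimal model, not Cassandra code: it enumerates every possible set of replicas that acknowledged the write and every possible set that answered the read, and tests whether they always share at least one node. The function name `always_overlaps` is invented for this example.

```python
from itertools import combinations


def always_overlaps(rf: int, w: int, r: int) -> bool:
    """True if every w-node write set intersects every r-node read set."""
    nodes = range(rf)
    return all(
        set(write_set) & set(read_set)  # a non-empty intersection is truthy
        for write_set in combinations(nodes, w)
        for read_set in combinations(nodes, r)
    )


print(always_overlaps(3, 2, 2))  # QUORUM write + QUORUM read
print(always_overlaps(3, 1, 2))  # ONE write + QUORUM read
```

The first call prints `True` because any two 2-node subsets of 3 nodes share a node; the second prints `False` because the single written node can be the one the read skipped.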