Methods to Verify Cassandra Node Sync

stephon · Mar 27, 2012 · Viewed 13.4k times

I have a 3-node Cassandra cluster with a replication factor of 2. One of the nodes has been replaced with a new one, and I have used "nodetool repair" to repair all the keyspaces. However, I don't know how to verify that all the keyspaces are in sync.

Earlier I found this question, which helps, but only a little: Cassandra Data Replication problem

Is there any way to verify that keyspaces with a replication factor > 1 are in sync in Cassandra?

Thanks a lot.

stephon

Answer

Tyler Hobbs · Mar 28, 2012

First, if you run nodetool repair again and very little data is transferred (assuming all nodes have been up since the last time you ran it), you know that the data is almost perfectly in sync. You can look at the logs to see how much data is transferred during this process.

Second, you can verify that all of the nodes are getting a similar number of writes by looking at the write counts with nodetool cfstats. Note that the write count value is reset each time Cassandra restarts, so if the nodes weren't restarted around the same time, you'll have to compare how quickly each node's count is increasing over time rather than the absolute values.
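The rate comparison described above can be sketched in Python. This is only an illustration under stated assumptions: the helper names, node names, and counts are hypothetical, and the parser assumes that cfstats prints a line of the form "Write Count: <number>", which may differ between Cassandra versions (the label is a parameter for that reason). The idea is to sample each node's write count twice, some seconds apart, and compare the increase per second instead of the raw counters.

```python
import re


def first_write_count(cfstats_text, label="Write Count"):
    """Return the first '<label>: <n>' value found in cfstats output.

    Assumption: the output contains a line such as 'Write Count: 42'.
    Pass a different label if your Cassandra version prints another one.
    """
    pattern = re.compile(r"^\s*" + re.escape(label) + r":\s*(\d+)\s*$",
                         re.MULTILINE)
    match = pattern.search(cfstats_text)
    if match is None:
        raise ValueError("no %r line found" % label)
    return int(match.group(1))


def write_rates(before, after, seconds):
    """Writes per second for each node between two samples."""
    return {node: (after[node] - before[node]) / seconds for node in before}


def max_relative_spread(rates):
    """Gap between the fastest and slowest node, relative to the fastest."""
    low, high = min(rates.values()), max(rates.values())
    return 0.0 if high == 0 else (high - low) / high


# Hypothetical samples taken 60 seconds apart. The absolute counts differ
# widely because the nodes were restarted at different times.
before = {"node1": 1000, "node2": 50, "node3": 200000}
after = {"node1": 1600, "node2": 640, "node3": 200610}

rates = write_rates(before, after, seconds=60)
spread = max_relative_spread(rates)
print(rates, spread)
```

A small spread suggests the nodes are receiving a similar share of writes. A large one is only a hint to look further, since uneven token ranges also produce uneven write rates.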

Last, if you just want to spot check a few recently updated values, you can try reading those values at consistency level ONE. If you always get the most up-to-date version of the data, you'll know that the replicas are likely in sync.
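A spot check along these lines could look like the following Python sketch. It is not from the original answer: `spot_check` and `make_cassandra_reader` are hypothetical helpers, and the reader assumes the DataStax Python driver (`cassandra-driver`) and a CQL table, neither of which the question mentions. Repeating the read is useful because reads at consistency level ONE may be served by different replicas, depending on the driver's load-balancing policy and the cluster's snitch.

```python
from collections import Counter


def spot_check(read_value, expected, attempts=20):
    """Call read_value() repeatedly and tally fresh vs. stale results."""
    tally = Counter()
    for _ in range(attempts):
        outcome = "fresh" if read_value() == expected else "stale"
        tally[outcome] += 1
    return tally


def make_cassandra_reader(hosts, keyspace, cql, params):
    """Build a zero-argument reader that queries at consistency level ONE.

    Assumption: cassandra-driver is installed. The imports are local so
    that spot_check() can be used and tested without the driver.
    """
    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(hosts).connect(keyspace)
    statement = SimpleStatement(cql, consistency_level=ConsistencyLevel.ONE)

    def read_value():
        # Return the first column of the first row, or None if no row exists.
        row = session.execute(statement, params).one()
        return row[0] if row is not None else None

    return read_value


# Stand-in reader for illustration: one of four reads returns an old value.
fake_values = iter(["v2", "v1", "v2", "v2"])
result = spot_check(lambda: next(fake_values), expected="v2", attempts=4)
print(result["fresh"], result["stale"])
```

Against a real cluster, the reader would come from something like `make_cassandra_reader(["10.0.0.1"], "my_keyspace", "SELECT value FROM my_table WHERE key = %s", ("some-key",))`, where the address, keyspace, table, and key are placeholders. Any "stale" result for a value written a while ago suggests that at least one replica has not caught up.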

As a general note, replication is such an ingrained part of Cassandra that it's extremely unlikely to fail on its own without you noticing. Typically a node will be marked down shortly after problems start. Also, I'm assuming you're writing at consistency level ONE or ANY; with anything higher, you know for sure that both of the replicas have received the write.