Cassandra: Removing a node

keelar · Aug 15, 2013 · Viewed 11.7k times

I would like to remove a node from my Cassandra cluster and am following these two related questions (here and here) as well as the Cassandra documentation, but I am still not quite sure of the exact process.

My first question is: Is the following way to remove a node from a Cassandra cluster correct?

  1. decommission the node that I would like to remove.
  2. removetoken the node that I just decommissioned.

If the above process is right, how can I tell when the decommission process is completed so that I can proceed to the second step? Or is it always safe to do step 2 right after step 1?

In addition, Cassandra document says:

You can take a node out of the cluster with nodetool decommission to a live node, or nodetool removetoken (to any other machine) to remove a dead one. This will assign the ranges the old node was responsible for to other nodes, and replicate the appropriate data there. If decommission is used, the data will stream from the decommissioned node. If removetoken is used, the data will stream from the remaining replicas.

No data is removed automatically from the node being decommissioned, so if you want to put the node back into service at a different token on the ring, it should be removed manually.

Does this mean a decommissioned node is a dead node? In addition, as no data is removed automatically from the node being decommissioned, how can I tell when it is safe to remove the data from the decommissioned node (i.e., how do I know when the data streaming is completed)?

Answer

keelar · Aug 15, 2013

Removing a node from a Cassandra cluster takes the following steps (verified in Cassandra v1.2.8):

  1. Decommission the target node with nodetool decommission.
  2. Once the data streaming from the decommissioned node is completed, manually delete the data left on the decommissioned node (optional). Both steps are shown in command form right below.
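
In command form, the whole procedure boils down to this sketch (node-76 is a placeholder host; the placeholder paths in the last command come from cassandra.yaml, as discussed at the end of this answer):

> nodetool -host node-76 decommission
> # wait until the streaming finishes (see the nodetool netstats check below),
> # then, on node-76, delete the leftover data directories:
> rm -rf <data_file_directories> <commitlog_directory> <saved_caches_directory>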

From the docs:

nodetool decommission - Decommission the *node I am connecting to*

Update: The above process also works for seed nodes. In that case, the cluster can still run smoothly without requiring a restart. When you later restart the cluster for other reasons, be sure to update the seeds parameter specified in cassandra.yaml on all nodes.
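
For reference, the seeds list lives under the seed_provider section of cassandra.yaml. A minimal sketch, assuming the stock SimpleSeedProvider and placeholder hostnames; your provider settings may differ:

# cassandra.yaml -- drop the decommissioned node (node-76) from the seed list
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # comma-separated list of the remaining seed hosts
          - seeds: "node-70,node-71"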


Decommission the target node

When decommission starts, the decommissioned node is first labeled as leaving (state L, so it shows up as UL in nodetool status). In the following example, we will remove node-76:

> nodetool -host node-76 decommission
> nodetool status

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns   Host ID                               Rack
UN  node-70  9.79 GB    256     8.3%   e0a7fb7a-06f8-4f8b-882d-c60bff51328a  155
UN  node-80  8.9 GB     256     9.2%   43dfc22e-b838-4b0b-9b20-66a048f73d5f  155
UN  node-72  9.47 GB    256     9.2%   75ebf2a9-e83c-4206-9814-3685e5fa0ab5  155
UN  node-71  9.48 GB    256     9.5%   cdbfafef-4bfb-4b11-9fb8-27757b0caa47  155
UN  node-91  8.05 GB    256     8.4%   6711f8a7-d398-4f93-bd73-47c8325746c3  155
UN  node-78  9.11 GB    256     9.4%   c82ace5f-9b90-4f5c-9d86-0fbfb7ac2911  155
UL  node-76  8.36 GB    256     9.5%   15d74e9e-2791-4056-a341-c02f6614b8ae  155
UN  node-73  9.36 GB    256     8.9%   c1dfab95-d476-4274-acac-cf6630375566  155
UN  node-75  8.93 GB    256     8.2%   8789d89d-2db8-4ddf-bc2d-60ba5edfd0ad  155
UN  node-74  8.91 GB    256     9.6%   581fd5bc-20d2-4528-b15d-7475eb2bf5af  155
UN  node-79  9.71 GB    256     9.9%   8e192e01-e8eb-4425-9c18-60279b9046ff  155

While the decommissioned node is marked as leaving, it is streaming its data to the other live nodes. Once the streaming is completed, the node disappears from the ring, and the ownership of the remaining nodes increases accordingly:

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns   Host ID                               Rack
UN  node-70  9.79 GB    256     9.3%   e0a7fb7a-06f8-4f8b-882d-c60bff51328a  155
UN  node-80  8.92 GB    256     9.6%   43dfc22e-b838-4b0b-9b20-66a048f73d5f  155
UN  node-72  9.47 GB    256     10.2%  75ebf2a9-e83c-4206-9814-3685e5fa0ab5  155
UN  node-71  9.69 GB    256     10.6%  cdbfafef-4bfb-4b11-9fb8-27757b0caa47  155
UN  node-91  8.05 GB    256     9.1%   6711f8a7-d398-4f93-bd73-47c8325746c3  155
UN  node-78  9.11 GB    256     10.5%  c82ace5f-9b90-4f5c-9d86-0fbfb7ac2911  155
UN  node-73  9.36 GB    256     9.7%   c1dfab95-d476-4274-acac-cf6630375566  155
UN  node-75  9.01 GB    256     9.5%   8789d89d-2db8-4ddf-bc2d-60ba5edfd0ad  155
UN  node-74  8.91 GB    256     10.5%  581fd5bc-20d2-4528-b15d-7475eb2bf5af  155
UN  node-79  9.71 GB    256     11.0%  8e192e01-e8eb-4425-9c18-60279b9046ff  155
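
How do you know the streaming is completed (the original question)? Besides watching the node disappear from nodetool status, you can poll nodetool netstats against the decommissioning node: once it reports no active streams (the mode line should eventually read DECOMMISSIONED), the transfer is done. A sketch of what to look for; the exact output wording varies by version:

> nodetool -host node-76 netstats
Mode: DECOMMISSIONED
Not sending any streams.
Not receiving any streams.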

Removing the remaining data manually

Once the streaming is completed, the data stored on the decommissioned node can be removed manually, as described in the Cassandra documentation:

No data is removed automatically from the node being decommissioned, so if you want to put the node back into service at a different token on the ring, it should be removed manually.

This can be done by removing the data stored under the data_file_directories, commitlog_directory, and saved_caches_directory locations specified in the cassandra.yaml file on the decommissioned node.
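
As a concrete sketch, assuming the common package defaults under /var/lib/cassandra (confirm the actual paths in your own cassandra.yaml before deleting anything):

> # on the decommissioned node, after verifying the streaming has finished
> rm -rf /var/lib/cassandra/data
> rm -rf /var/lib/cassandra/commitlog
> rm -rf /var/lib/cassandra/saved_caches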