HelpfulPanda

Reputation: 377

How to recover a Cassandra node by streaming from a seed node?

An Apache Cassandra node is running in a 3-node cluster with a replication factor of 3. All configurations are correct. The Cassandra version is 2.1.8.

Let's assume that the data is logically damaged beyond repair, meaning that it's not recoverable by the usual tools (scrub / repair).

The node is down, but the cluster still lists it in the normal state.

Considering a scenario of node recovery by streaming from a seed node which is registered in cassandra.yaml (and not replacing the node with another one):

  1. What happens if I delete the top-level data directory on that node, including the commitlog, data, hints and saved_caches directories, and start the service? Would the node resume gracefully from that point, and could I then just run nodetool repair to get the old data back onto the node?

  2. If I instead start the service and run nodetool rebuild, would that be appropriate and sufficient to fix the problem?

  3. If neither of the above is best practice, would it be a solution to decommission the node and have it join the cluster again?

Upvotes: 1

Views: 675

Answers (1)

Aaron

Reputation: 57748

You'll have better luck decommissioning the node, wiping it (the data, commitlog, and saved_caches directories), specifying its IP as the replacement address in cassandra-env.sh, and rejoining it to the cluster.
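As a rough sketch of the cassandra-env.sh part: the replacement address is normally passed to the JVM through the cassandra.replace_address system property. The IP 10.0.0.3 below is a hypothetical placeholder for the node's own address, not something taken from the question.

```shell
# Fragment for cassandra-env.sh on the wiped node.
# 10.0.0.3 is a hypothetical placeholder; substitute the node's own IP address.
JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.3"
```

It is usually a good idea to remove this line again once the node has finished streaming and is back to normal, so that it is not applied on later restarts.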

nodetool rebuild is useful when you have multiple data centers, and you want to direct the streams to come from a specific DC.

nodetool repair will technically work (so the answer to #1 is "yes"), but you'll spend a lot of time waiting on Merkle tree calculation. Repairs are worth running weekly, and are great for fixing minor consistency discrepancies. But past a certain point, building the Merkle trees to discover the differences and then repairing them becomes slower than simply doing a decom/rejoin of the node.

Upvotes: 2
