anuja Mandlecha
anuja Mandlecha

Reputation: 115

Enabling vNodes in Cassandra 1.2.8

I have a 4 node cluster and I have upgraded all the nodes from an older version to Cassandra 1.2.8. Total data present in the cluster is of size 8 GB. Now I need to enable vNodes on all the 4 nodes of cluster without any downtime. How can I do that?

Upvotes: 1

Views: 1115

Answers (3)

Nikhil
Nikhil

Reputation: 2308

In the conf/cassandra.yaml you will need to comment out the initial_token parameter, and enable the num_tokens parameter (by default 256 I believe). Do this for each node. Then you will have to restart the cassandra service on each node. And wait for the data to get redistributed throughout the cluster. 8 GB should not take too much time (provided your nodes are all in the same cluster), and read requests will still be functional, though you might see degraded performance until the redistribution of data is complete.

EDIT: Here is a potential strategy to migrate your data:

  • Decommission two nodes of the cluster. The token-space should get distributed 50-50 between the other two nodes.
  • On the two decommissioned nodes, remove the existing data, and restart the Cassandra daemon with a different cluster name and with the num_token parameters enabled.
  • Migrate the 8 GB of data from the old cluster to the new cluster. You could write a quick script in python to achieve this. Since the volume of data is small enough, this should not take too much time.
  • Once the data is migrated in the new cluster, decommission the two old nodes from the old cluster. Remove the data and restart Cassandra on them, with the new cluster name and the num_tokens parameter. They will bootstrap and data will be streamed from the two existing nodes in the new cluster. Preferably, only bootstrap one node at a time.

With these steps, you should never face a situation where your service is completely down. You will be running with reduced capacity for some time, but again since 8GB is not a large volume of data you might be able to achieve this quickly enough.

Upvotes: 1

Richard
Richard

Reputation: 11110

As Nikhil said, you need to increase num_tokens and restart each node. This can be done one at once with no down time.

However, increasing num_tokens doesn't cause any data to redistribute so you're not really using vnodes. You have to redistribute it manually via shuffle (explained in the link Lyuben posted, which often leads to problems), by decommissioning each node and bootstrapping back (which will temporarily leave your cluster extremely unbalanced with one node owning all the data), or by duplicating your hardware temporarily just like creating a new data center. The latter is the only reliable method I know of but it does require extra hardware.

Upvotes: 1

Lyuben Todorov
Lyuben Todorov

Reputation: 14173

TL;DR;

No you need to restart servers once the config has been edited

The problem is that enabling vnodes means a lot of the data is redistributed randomly (the docs say in a vein similar to the classic ‘nodetool move’

Upvotes: 0

Related Questions