Apache Cassandra decommission second DC and join nodes into first DC as brand new nodes?

Question

My Cassandra cluster consists of 2 DCs; each DC has 5 nodes, and the replication factor per DC is 3. Both DCs are hosted on the same Docker orchestrator. This is a legacy setup, probably dating from the last major system migration years ago. At present I don't see any advantage in having 2 DCs with the same replication factor of 3: this way the same data is written 6 times. The cluster is at least 80% write heavy; reads are fairly limited.

Cassandra struggles under load at peak times, so I would like to have 1 DC with 10 nodes (instead of 2 DCs x 5 nodes) to be able to balance across 10 nodes instead of just 5. This would also bring down the data size per node. With the same amount of RAM and CPU dedicated to Cassandra, I would gain performance and free up storage space ;-)
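As a rough check of those numbers, here is a minimal sketch of the arithmetic. It assumes data is spread evenly across the nodes of a DC, which real clusters only approximate; the function name per_node_share is made up for illustration.

```python
def per_node_share(rf, nodes):
    """Fraction of the full dataset stored on each node, assuming an even spread."""
    return rf / nodes

# Current layout: 2 DCs, each with 5 nodes and RF 3
current_copies = 3 + 3
current_share = per_node_share(3, 5)

# Target layout: 1 DC with 10 nodes and RF 3
target_copies = 3
target_share = per_node_share(3, 10)

print(current_copies, current_share)  # 6 0.6
print(target_copies, target_share)    # 3 0.3
```

Under that assumption, each node would hold about half of what it holds today, and every write would go to 3 replicas instead of 6.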

So the idea is to decommission DC2 and bring all 5 of its nodes into DC1 as brand new nodes. The steps are known:

1. Alter the keyspaces to replicate to DC1 only.
2. Make sure no clients are writing to or reading from DC2 (DC-aware load balancing policy with LOCAL_* consistency levels).

I wonder about the next step: it says I need to start decommissioning DC2 node by node. Is this mandatory, or could I somehow just take those nodes down? The goal is to decommission not some, but all nodes in a DC. If I decommission, say, node5, its data would be transferred to the remaining 4 nodes, and so on. At some point I would be left with 3 nodes and replication factor 3, so I wouldn't be able to decommission any further. What is more, I guess there would be no free space left on those nodes' volumes, and I am not willing to extend them any further.

So my questions are:

1. Is there a way to alter the keyspaces to DC1 only, then just bring all DC2 nodes down, erase their volumes, and add them one by one to DC1, expanding DC1? Basically, to decommission all DC2 nodes at once?
2. Is there an even quicker way to move those 5 DC2 nodes to DC1 (in the end they contain the same data as the 5 nodes in DC1)? Like just joining them to DC1 with all the data they contain?
3. What is the advantage of having 2 DCs in a single cluster, instead of having a single DC with double the nodes? Or does it strongly depend on the usage and the way services write and read data from Cassandra?

Appreciate your replies, thanks.

Cheers, OvivO

Aaron · Accepted Answer

Is there a way to alter the keyspaces to DC1 only, then just bring all DC2 nodes down, erase their volumes, and add them one by one to DC1, expanding DC1? Basically, to decommission all DC2 nodes at once?

Yes, you can adjust the keyspace definitions to replicate only within DC1. Since you're basically removing a DC, you could shut all of its nodes down and run a nodetool removenode for each (from a live DC1 node, using the host ID shown by nodetool status). In theory, that would remove the nodes from gossip and (since they're down) not attempt to move data around. Then yes, add each node back to DC1, one at a time. Once you're done, run a repair, followed by a nodetool cleanup on each node.
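The sequence above could be sketched as a small plan generator. This is only an illustration, not a tested runbook: it prints commands rather than running them, the keyspace name and host IDs are placeholders, and the wipe/reconfigure step is left as a comment because how a node's DC is set depends on the snitch in use. Any system keyspaces that replicate to DC2 would presumably need the same ALTER.

```python
def alter_keyspace_cql(keyspace, dc, rf):
    """Build the CQL that limits a keyspace to a single DC."""
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', '{dc}': {rf}}};"
    )

def removal_plan(keyspaces, keep_dc, rf, dc2_host_ids):
    """Return the ordered steps as strings; nothing is executed."""
    steps = [alter_keyspace_cql(ks, keep_dc, rf) for ks in keyspaces]
    # After the DC2 nodes are shut down, run these from a live node in keep_dc
    steps += [f"nodetool removenode {host_id}" for host_id in dc2_host_ids]
    steps.append("# wipe each old DC2 node, configure it for " + keep_dc
                 + ", and start the nodes one at a time")
    steps.append("nodetool repair")
    steps.append("nodetool cleanup")  # run on each node
    return steps

# Placeholder values, for illustration only
plan = removal_plan(["my_ks"], "DC1", 3, ["host-id-1", "host-id-2"])
for step in plan:
    print(step)
```

Printing the plan first makes it easy to review the order of operations before touching the cluster.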

Is there an even quicker way to move those 5 DC2 nodes to DC1 (in the end they contain the same data as the 5 nodes in DC1)? Like just joining them to DC1 with all the data they contain?

No. Token range assignment is DC dependent. If the nodes moved to a different DC, their range assignments would change, and they would very likely be responsible for different ranges of data.
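A toy model can show why. The sketch below is a simplification, not Cassandra's actual code: it assumes one token per node (no vnodes), a ring of size 100, and made-up token values, and it treats a node as owning the range from its predecessor's token (within the same DC) up to its own token.

```python
def primary_ranges(tokens):
    """Map each token to the (start, end] range it owns on the ring.

    Toy model: one token per node; a node owns everything after its
    predecessor's token up to and including its own token.
    """
    ordered = sorted(tokens)
    ranges = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # i == 0 wraps around to the last token
        ranges[tok] = (prev, tok)
    return ranges

# Hypothetical tokens for the two DCs
dc1_tokens = [0, 20, 40, 60, 80]
dc2_tokens = [10, 30, 50, 70, 90]

before = primary_ranges(dc2_tokens)              # node counted within DC2
after = primary_ranges(dc1_tokens + dc2_tokens)  # same node counted within DC1

print(before[30])  # (10, 30)
print(after[30])   # (20, 30)
```

In this model the node with token 30 owns a different, smaller range once it sits in the same ring as the DC1 nodes, so the data already on its disk would not match what it is responsible for.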

What is the advantage of having 2 DCs in a single cluster, instead of having a single DC with double the nodes?

Geographic awareness. If you have a mobile app and users on both the West Coast and East Coast, you don't want your East Coast users making a call for data all the way to the West Coast. You want that data call to happen as locally as possible. So, you'd build up a DC on each coast, and let Cassandra keep them in sync.
