Mani

Reputation: 1

Cassandra cluster - Store equal data among the nodes

In a Cassandra cluster, how can we ensure that all nodes hold roughly equal amounts of data, rather than one node holding much more and another very little?

If this scenario occurs, what are the best practices?

Thanks

Upvotes: 0

Views: 524

Answers (2)

xmas79

Reputation: 5180

Unless you are using ByteOrderedPartitioner for your cluster, that should not happen. See the DataStax documentation on partitioners for more information about the available partitioners and why it should not (normally) happen.

Upvotes: 0

Erick Ramirez

Reputation: 16403

A slight variation of 5-10% is to be expected. The most common causes are that the distribution of your partitions is not truly random (more partitions land on some nodes than on others) and that partition sizes vary widely (for example, the smallest partition is a few kilobytes but the largest is 2GB).
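As a rough way to judge whether the spread falls within that 5-10% band, each node's load can be compared against the cluster mean. The sketch below is illustrative only: the node addresses and load figures are made up, and `load_deviation` is a hypothetical helper, not part of Cassandra. In practice the loads would be read from the Load column of `nodetool status`.

```python
from statistics import mean


def load_deviation(loads_gb):
    """Return each node's deviation from the mean cluster load, in percent."""
    avg = mean(loads_gb.values())
    return {node: (load - avg) / avg * 100 for node, load in loads_gb.items()}


# Hypothetical per-node loads in GB (assumed values, not real measurements).
loads = {"10.0.0.1": 100.0, "10.0.0.2": 105.0, "10.0.0.3": 95.0}

for node, pct in load_deviation(loads).items():
    # Flag anything beyond the upper end of the expected 5-10% variation.
    note = "  <-- worth investigating" if abs(pct) > 10 else ""
    print(f"{node}: {pct:+.1f}%{note}")
```

A node that sits well outside the band is the one to look at first for oversized partitions or an uneven token range.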

There are also two other possible scenarios to consider.

SINGLE-TOKEN CLUSTER

If the tokens are not correctly calculated, some nodes may own a larger token range than others. Use the token generation tool to get a list of tokens that is evenly distributed around the ring.
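To show what "evenly distributed" means here, the following is a minimal sketch of the usual calculation, assuming the cluster uses Murmur3Partitioner, whose token range runs from -2^63 to 2^63 - 1 (the partitioner in use is not stated above, and other partitioners have a different range). `murmur3_tokens` is a hypothetical helper name. Each resulting value would be set as `initial_token` in the corresponding node's cassandra.yaml.

```python
def murmur3_tokens(node_count):
    """Evenly spaced initial tokens across the Murmur3Partitioner range.

    Assumes Murmur3Partitioner, i.e. tokens in [-2**63, 2**63 - 1].
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2**64 // node_count  # size of each node's slice of the ring
    return [i * step - 2**63 for i in range(node_count)]


# Example: a 4-node single-token cluster.
for i, token in enumerate(murmur3_tokens(4)):
    print(f"node {i}: initial_token: {token}")
```

In a multi-datacenter cluster the tokens are normally calculated per datacenter, so treat this as a single-datacenter illustration only.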

If the cluster is deployed with DataStax Enterprise, the easiest way is to rebalance your cluster with OpsCenter.

VNODES CLUSTER

Confirm that every node is configured with the same number of tokens via the num_tokens directive in cassandra.yaml. A node with more tokens than its peers owns a proportionally larger share of the ring.
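One way to run that check across nodes is to collect each node's cassandra.yaml and compare the `num_tokens` values. The sketch below assumes the file contents have already been gathered into strings; the node names, config snippets, and the helpers `read_num_tokens` and `mismatched_nodes` are all hypothetical and exist only for illustration.

```python
import re
from collections import Counter


def read_num_tokens(yaml_text):
    """Return the num_tokens value from cassandra.yaml text, or None if unset."""
    # Anchored to line start, so a commented-out "# num_tokens: ..." is ignored.
    match = re.search(r"^\s*num_tokens:\s*(\d+)", yaml_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def mismatched_nodes(configs):
    """Return nodes whose num_tokens differs from the most common value."""
    values = {node: read_num_tokens(text) for node, text in configs.items()}
    expected = Counter(values.values()).most_common(1)[0][0]
    return {node: value for node, value in values.items() if value != expected}


# Hypothetical config snippets, one per node.
configs = {
    "node1": "cluster_name: demo\nnum_tokens: 256\n",
    "node2": "cluster_name: demo\nnum_tokens: 256\n",
    "node3": "cluster_name: demo\nnum_tokens: 16\n",
}
print(mismatched_nodes(configs))
```

Here node3 would be reported, since it is configured with fewer tokens than the other two nodes.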

Upvotes: 0
