Reputation: 6811
Let’s assume that the total disk usage of all keyspaces is 100GB before replication. The replication factor is 3. Making the total physical disk usage = 100GB x 3 = 300GB.
We use the default compaction strategy (size-tiered) and let’s assume the worse case where Cassandra needs as much free space as the the data to complete the compaction. Does Cassandra needs 100GB (before replication) or 300GB (100GB x3 with replication)?
In other words, when Cassandra needs free disk space for performing compaction, does the replication factor has any influence?
Upvotes: 1
Views: 116
Reputation: 2288
Compaction in Cassandra is local to a Node.
Now let's say you have a 3 node cluster, replication factor is also 3, and the original data size is 100GB. This means that each node has 100GB worth of data.
Hence on each node, I will need 100GB free space to compact the data present on that node.
TLDR; Free space required for Compaction is equal to the total data present on the node.
Upvotes: 3
Reputation: 87349
Because data is replicated between the nodes, every node will need to have up to 100Gb free space - so it's total of the 300Gb, but not on one node...
Upvotes: 2