Reputation: 219
We recently inserted and then deleted millions of records from a table, and a table of size 10 GB was truncated.
We are running 2 nodes with SizeTieredCompactionStrategy. CPU utilization is currently at 100% and pending compactions keep increasing; the current count is 293144.
Any pointers to reduce CPU utilization and get the compaction done quickly?
Upvotes: 1
Views: 1952
Reputation: 7305
"reduce CPU utilization and get the compaction done quickly."
These two goals pull against each other. You can either accelerate compaction (by giving it more resources) or limit the resources available to compaction so that your writes aren't affected, at the cost of it taking longer.
If you have an ingest running against your Cassandra cluster, I would try to ensure that it is not affected by your compactions. As long as the number of pending compactions is decreasing over time, it's just a matter of time.
If you don't have reads or writes coming in (i.e. during downtime or while you're bootstrapping), it's okay to let compactions use up all your resources and finish fast.
The levers are:
1) get/set compaction throughput (nodetool getcompactionthroughput / setcompactionthroughput) -- a new value only kicks in for the next available compaction. This caps how fast compaction runs. The default is 16 MB/s; if you have resources available, you can increase it to a larger number.
2) concurrent compactors -- there are 2 values you have to set in JMX. You can do this on the fly using jmxsh, jconsole, etc. This is the number of compactions that can run at a time (typically sized to the number of cores).
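To get a rough feel for what the throughput cap in lever 1 means, here is a back-of-the-envelope sketch in Python. It assumes the cap applies to the node's total compaction I/O and ignores CPU limits and the fact that data may be rewritten more than once, so the result is only a lower bound. The function name and the example figures are illustrative, not anything that ships with Cassandra.

```python
def min_compaction_seconds(data_mb: float, throughput_mb_per_s: float) -> float:
    """Lower bound on the time to compact `data_mb` of SSTable data
    when compaction is throttled to `throughput_mb_per_s`.

    Note: in nodetool, a throughput of 0 means "unthrottled"; that case
    is not modelled here and is rejected instead.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_mb / throughput_mb_per_s


if __name__ == "__main__":
    ten_gb_in_mb = 10 * 1024  # example data size, chosen for illustration
    for cap in (16, 64, 128):
        secs = min_compaction_seconds(ten_gb_in_mb, cap)
        print(f"{cap:>4} MB/s -> at least {secs / 60:.1f} minutes")
```

If the estimate at the current cap is far longer than you can tolerate and the node has idle disk and CPU, that is the case where raising the throughput makes sense.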
Watch nodetool compactionstats or OpsCenter (where you can also chart pending compactions and set alerts) to track the progress of current compactions, or use nodetool compactionhistory for completed compactions.
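If you don't have OpsCenter, a small script can do the watching for you. The following is a minimal sketch, assuming nodetool is on the PATH and that the output of nodetool compactionstats contains a line of the form "pending tasks: N". The helper names (parse_pending, is_draining, watch) and the polling interval are made up for illustration.

```python
import re
import subprocess
import time
from typing import List

# Assumed format of the summary line in `nodetool compactionstats` output.
PENDING_RE = re.compile(r"pending tasks:\s*(\d+)")


def parse_pending(output: str) -> int:
    """Extract the pending-task count from compactionstats output."""
    match = PENDING_RE.search(output)
    if match is None:
        raise ValueError("no 'pending tasks' line found")
    return int(match.group(1))


def is_draining(samples: List[int]) -> bool:
    """True if the newest sample is lower than the oldest one."""
    return len(samples) >= 2 and samples[-1] < samples[0]


def watch(interval_s: int = 60, rounds: int = 10) -> None:
    """Poll nodetool and report whether the backlog is shrinking."""
    samples: List[int] = []
    for _ in range(rounds):
        out = subprocess.run(
            ["nodetool", "compactionstats"],
            capture_output=True, text=True, check=True,
        ).stdout
        samples.append(parse_pending(out))
        state = "draining" if is_draining(samples) else "not draining yet"
        print(f"pending={samples[-1]} ({state})")
        time.sleep(interval_s)
```

Calling watch() on a node prints one line per interval. If the count keeps growing while ingest is running, compaction is not keeping up and one of the levers above needs adjusting.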
"a table of size 10 GB was truncated."
Truncates are free; no compaction is needed.
Upvotes: 6