How to investigate the root cause of a sudden increase in pending compactions

We have a Cassandra 2.0.17 cluster which we're currently loading with data; suddenly the cluster seems to be having trouble keeping up with compaction tasks. This seems to coincide with the time when each node was briefly taken offline, one by one, for a firmware update.

See our OpsCenter Dashboard

Wondering how to dig for the root cause; hints appreciated!

Also wondering how to ensure better balancing of disk/IO usage among the assigned file systems.

During compaction it seems some CFs create large temp files like these:

-rw-r--r--. 1 cass cassandra     43726347 May  5 14:17 KeyspaceBlobStore-CF_Message_1-tmp-jb-22142-CompressionInfo.db
-rw-r--r--. 1 cass cassandra 340293724737 May  5 14:17 KeyspaceBlobStore-CF_Message_1-tmp-jb-22142-Data.db
-rw-r--r--. 1 cass cassandra    266403840 May  5 14:17 KeyspaceBlobStore-CF_Message_1-tmp-jb-22142-Index.db

Is this efficient on an XFS file system, or would it be better to spread the data over more, smaller files, possibly speeding up compaction?

E.g. one node's FS usage over the past 7 days can be seen here, showing that the blob-3 file system has sharply increased usage, mainly due to the large temp file above. Is this just because the compaction is taking too long?

TIA

Upvotes: 0

Views: 133

Answers (1)

Ben Slater

Reputation: 126

It looks like you're possibly in a compaction death spiral of getting behind on compactions -> more I/O + CPU spent on reads -> getting further behind on compactions. Taking the nodes offline (which means they have to serve higher write levels to catch up once brought back online) might have triggered the spiral.

It is to be expected that you have large temp files, as one of the things compaction does is take multiple smaller files and combine them into a single larger file.
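If you want to confirm that a -tmp- file of that size belongs to an in-flight compaction, a quick (hedged) check is nodetool's compaction statistics, which list the keyspace/CF and bytes completed versus total for each running task; cfstats gives per-CF SSTable counts (the grep is just one way to narrow the output to your CF):

# show currently running compactions plus the pending task count
nodetool compactionstats

# per-CF statistics, including SSTable counts; filter for the affected CF
nodetool cfstats | grep -A 20 CF_Message_1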

This can be a hard situation to get out of, as adding nodes to the cluster may increase overall load while they are joining. One approach that sometimes works for us is taking nodes offline (using nodetool disablegossip, disablethrift and disablebinary) so they can catch up on compactions without serving reads and writes.
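A rough sketch of that approach, run with nodetool on each affected node in turn (the re-enable order afterwards is just a suggestion):

# stop serving clients and drop out of gossip so the node no longer
# takes reads/writes while it works through its backlog
nodetool disablebinary    # native protocol (CQL) clients
nodetool disablethrift    # Thrift clients
nodetool disablegossip    # stop participating in the ring

# watch pending compactions drain
nodetool compactionstats

# once the backlog is back to a sane level, rejoin the cluster
nodetool enablegossip
nodetool enablethrift
nodetool enablebinary

The point of disabling gossip/thrift/binary rather than stopping the node is that the JVM, and therefore the compaction threads, keeps running while client and replica traffic is shut off.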

In terms of root cause, given your rapidly increasing data volumes and very high disk-to-node ratio (nearly 10 TB per node?), I'd be looking for I/O bottlenecks - increasing CPU iowait being a good indication.
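One way to watch for that (assuming sysstat is installed and the data directories sit on separate devices, which seems to be the case given the blob-3 file system):

# per-device utilisation, await and CPU iowait, refreshed every 5 seconds
iostat -x 5

# thread pool backlog; a growing "Pending" count on CompactionExecutor
# (and any dropped mutations) would corroborate an I/O bottleneck
nodetool tpstats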

Cheers Ben

Upvotes: 1
