How to investigate the root cause of a sudden increase in pending compactions

We have a Cassandra 2.0.17 cluster which we're currently loading with data; suddenly the cluster seems to be having trouble keeping up with compaction tasks. This seems to coincide with the time when each node was briefly taken offline, one by one, for a firmware update.

See our OpsCenter Dashboard

Wondering how to dig for the root cause; hints appreciated!

Also wondering how to ensure better balancing of disk/IO usage among the assigned file systems.

During compaction it seems some CFs create large temp files like these:

-rw-r--r--. 1 cass cassandra     43726347 May  5 14:17 KeyspaceBlobStore-CF_Message_1-tmp-jb-22142-CompressionInfo.db
-rw-r--r--. 1 cass cassandra 340293724737 May  5 14:17 KeyspaceBlobStore-CF_Message_1-tmp-jb-22142-Data.db
-rw-r--r--. 1 cass cassandra    266403840 May  5 14:17 KeyspaceBlobStore-CF_Message_1-tmp-jb-22142-Index.db

Is this efficient on an XFS file system, or would it be better to spread the data over more, smaller files, possibly speeding up compaction?

E.g. one node's FS usage over the past 7 days can be seen here, showing that the blob-3 file system has sharply increased usage, mainly due to the large temp file above. Is this just because the compaction is taking too long?

TIA

Upvotes: 0

Views: 133

Answers (1)

Ben Slater

Reputation: 126

It looks like you're possibly in a compaction death spiral of getting behind on compactions -> more I/O + CPU spent on reads -> getting further behind on compactions. Taking the nodes offline (which means they have to serve higher write levels to catch up once brought back online) might have triggered the spiral.

It is to be expected that you have large temp files, as one of the things compaction does is take multiple smaller files and combine them into a single larger file.
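If you want to confirm that a -tmp- file of that size belongs to an in-flight compaction, a quick (hedged) check is nodetool's compaction statistics, which list the keyspace/CF and bytes completed versus total for each running task; cfstats gives per-CF SSTable counts (the grep is just one way to narrow the output to your CF):

# show currently running compactions plus the pending task count
nodetool compactionstats

# per-CF statistics, including SSTable counts; filter for the affected CF
nodetool cfstats | grep -A 20 CF_Message_1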

This can be a hard situation to get out of, as adding nodes to the cluster may increase overall load while they are joining. One approach that sometimes works for us is taking nodes offline (using nodetool disablegossip, disablethrift and disablebinary) so they can catch up on compactions without serving reads and writes.
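A rough sketch of that approach, run with nodetool on each affected node in turn (the re-enable order afterwards is just a suggestion):

# stop serving clients and drop out of gossip so the node no longer
# takes reads/writes while it works through its backlog
nodetool disablebinary    # native protocol (CQL) clients
nodetool disablethrift    # Thrift clients
nodetool disablegossip    # stop participating in the ring

# watch pending compactions drain
nodetool compactionstats

# once the backlog is back to a sane level, rejoin the cluster
nodetool enablegossip
nodetool enablethrift
nodetool enablebinary

The point of disabling gossip/thrift/binary rather than stopping the node is that the JVM, and therefore the compaction threads, keeps running while client and replica traffic is shut off.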

In terms of root cause, given your rapidly increasing data volumes and very high disk-to-node ratio (nearly 10 TB per node?), I'd be looking for I/O bottlenecks - increasing CPU iowait being a good indication.
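One way to watch for that (assuming sysstat is installed and the data directories sit on separate devices, which seems to be the case given the blob-3 file system):

# per-device utilisation, await and CPU iowait, refreshed every 5 seconds
iostat -x 5

# thread pool backlog; a growing "Pending" count on CompactionExecutor
# (and any dropped mutations) would corroborate an I/O bottleneck
nodetool tpstats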

Cheers Ben

Upvotes: 1
