Reputation: 23
After upgrading our cluster (4 DC, Ubuntu 14.04 x64, cpp-driver 2.0.1 as the client in our app) from 4.6 to 4.7, a few lightly loaded nodes started logging "MessagingService.java:888 - 1 MUTATION messages dropped in last 5000ms", along with 1 pending HintedHandoff task in the thread pool dump.
What I tried:
running "nodetool truncatehints" on each running node in the cluster
switching from OpenJDK to Oracle JDK (1.7.0_76-b13)
decommissioning the node and rejoining it
How can I find this mutation/hint and drop it?
Side notes:
we did not increase the load (version 4.6 works fine with this load)
we did not decrease the node count
we have SSD-backed storage
Fixed in https://issues.apache.org/jira/browse/CASSANDRA-9129
Upvotes: 1
Views: 4610
Reputation: 7305
Dropped mutations usually mean that your disks are not able to keep up with your ingest rate. At this point it is worth finding out whether any thread pools are backing up (usually the flush writers, if this is an I/O issue). This is why Cassandra logs the thread pool status at that moment.
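To see how often this happens, you can total the dropped-message lines in the log. Here is a minimal Python sketch, assuming the lines look like the one quoted in the question; the function name `parse_dropped` is made up for illustration, and the wording of the log message may differ between versions.

```python
import re

# Pattern based on the log line quoted in the question, e.g.
# "MessagingService.java:888 - 1 MUTATION messages dropped in last 5000ms".
# Other versions may word this message differently (assumption).
DROPPED_RE = re.compile(r"(\d+) (\w+) messages dropped in last (\d+)ms")


def parse_dropped(lines):
    """Return a dict mapping message type (e.g. MUTATION) to total dropped count."""
    totals = {}
    for line in lines:
        match = DROPPED_RE.search(line)
        if match:
            count = int(match.group(1))
            verb = match.group(2)
            totals[verb] = totals.get(verb, 0) + count
    return totals
```

You would feed it the lines of your Cassandra system log, for example `parse_dropped(open(path_to_log))`, where `path_to_log` is wherever your installation writes its log.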
Cassandra is built on SEDA (staged event-driven architecture), with multiple thread pools that can each handle up to a certain number of parallel tasks. Pending thread pool tasks pile up when there are more active tasks than the pool can handle concurrently. They will eventually get processed once the system has the resources to do so, or be dropped under extreme circumstances.
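To make the pending/dropped distinction concrete, here is a toy Python model of a single stage. It is only a sketch of the general idea, not Cassandra's actual implementation: the `Stage` class, the tick-based scheduling, and the 2000 ms timeout are all assumptions chosen for illustration.

```python
from collections import deque


class Stage:
    """Toy model of one stage: a fixed number of workers plus a pending queue."""

    def __init__(self, workers, timeout_ms):
        self.workers = workers        # tasks the stage can run per tick
        self.timeout_ms = timeout_ms  # hypothetical max wait before a task is dropped
        self.pending = deque()        # arrival timestamps of queued tasks
        self.completed = 0
        self.dropped = 0

    def submit(self, now_ms, count):
        """Queue `count` tasks that arrive at time now_ms."""
        for _ in range(count):
            self.pending.append(now_ms)

    def tick(self, now_ms):
        """Run one scheduling round: serve up to `workers` tasks, drop stale ones."""
        served = 0
        while self.pending and served < self.workers:
            arrived = self.pending.popleft()
            if now_ms - arrived > self.timeout_ms:
                self.dropped += 1     # waited too long, so it is dropped, not run
            else:
                self.completed += 1
                served += 1


def simulate():
    """A burst of 10 tasks hits a stage that serves 2 tasks per second."""
    stage = Stage(workers=2, timeout_ms=2000)
    stage.submit(0, 10)
    for now_ms in range(0, 5000, 1000):
        stage.tick(now_ms)
    return stage.completed, stage.dropped, len(stage.pending)
```

In this model the first six tasks complete, while the last four are still queued after the timeout and get dropped. That is the same shape as a node whose disks cannot drain writes as fast as they arrive.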
To see the current status of your thread pools, use nodetool tpstats. Most likely your hints task has already been processed.
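If you want to check this regularly, you can filter the tpstats output for pools with a backlog. The Python sketch below assumes each pool row is a name followed by five integer columns (Active, Pending, Completed, Blocked, All time blocked); that layout is an assumption and may vary between versions, and `pools_with_backlog` is a made-up helper name.

```python
import re

# Assumed row layout: <pool name> <active> <pending> <completed> <blocked> <all time blocked>
# Header rows and the dropped-message section do not match this pattern and are skipped.
POOL_ROW_RE = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def pools_with_backlog(tpstats_text):
    """Return {pool_name: pending_count} for every pool with pending tasks."""
    backlog = {}
    for line in tpstats_text.splitlines():
        match = POOL_ROW_RE.match(line)
        if match:
            pending = int(match.group(3))
            if pending > 0:
                backlog[match.group(1)] = pending
    return backlog
```

You can capture the text with something like `subprocess.run(["nodetool", "tpstats"], capture_output=True, text=True).stdout` and pass it to the function. An empty result means nothing is queued at that moment.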
The fact that you were accumulating hints implies that one or more of your nodes were down, and hints are being replayed for consistency now that they have come back up.
Your core issue is the dropped mutations. Consider one of the following actions if you continue to see this:
Upvotes: 3