Reputation: 324
We are seeing a lot of hints timing out, but I don't see anything in the logs about nodes going DOWN. It seems strange to me that Cassandra is building up the hints table if it does not think any node is down. I don't see any GC pauses either.
Can someone help me solve this problem?
INFO [HintedHandoff:2] 2015-03-11 01:56:00,958 HintedHandOffManager.java (line 469) Timed out replaying hints to /1.1.1.79; aborting (0 delivered)
INFO [HintedHandoff:1] 2015-03-11 02:03:54,914 HintedHandOffManager.java (line 469) Timed out replaying hints to /1.1.1.76; aborting (0 delivered)
Upvotes: 2
Views: 2270
Reputation: 284
If you want to reproduce this behaviour, unplug the network cable from a machine for 5 seconds every 10 seconds, 10 times in a row.
Here are some extracts from another machine's /var/log/cassandra/system.log:
INFO [HintedHandoff:2] 2016-10-27 14:20:00,333 HintedHandOffManager.java:486 - Timed out replaying hints to /192.168.0.178; aborting (0 delivered)
INFO [HintedHandoff:1] 2016-10-27 14:26:13,393 HintedHandOffManager.java:367 - Started hinted handoff for host: fa16996c-722c-458b-a621-eb53efa79fb2 with IP: /192.168.0.178
INFO [HintedHandoff:1] 2016-10-27 14:28:27,959 HintedHandOffManager.java:486 - Timed out replaying hints to /192.168.0.178; aborting (28850 delivered)
INFO [HintedHandoff:2] 2016-10-27 14:36:17,398 HintedHandOffManager.java:367 - Started hinted handoff for host: fa16996c-722c-458b-a621-eb53efa79fb2 with IP: /192.168.0.178
As I understand it, sometimes it times out before the actual stream starts:
aborting (0 delivered)
Sometimes it aborts after the stream has started, reporting how many hints were sent and apparently setting something like a marker so it knows where to resume streaming next time:
aborting (28850 delivered)
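To see which targets keep timing out, and whether anything gets through at all, you can tally these messages per host. Below is a minimal Python sketch, assuming the log lines look like the ones quoted in this thread; the function name summarize_hint_timeouts and the regular expression are my own, not part of Cassandra.

```python
import re
from collections import defaultdict

# Matches the "Timed out replaying hints" message in both log layouts
# quoted in this thread. The pattern is an assumption based on those lines.
TIMEOUT_RE = re.compile(
    r"Timed out replaying hints to /(?P<host>[\d.]+); "
    r"aborting \((?P<count>\d+) delivered\)"
)


def summarize_hint_timeouts(lines):
    """Return {host: {"timeouts", "zero_delivered", "delivered"}} per target."""
    stats = defaultdict(
        lambda: {"timeouts": 0, "zero_delivered": 0, "delivered": 0}
    )
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match is None:
            continue  # not a hint-replay timeout line
        entry = stats[match.group("host")]
        count = int(match.group("count"))
        entry["timeouts"] += 1
        entry["delivered"] += count
        if count == 0:
            entry["zero_delivered"] += 1  # timed out before anything was sent
    return dict(stats)


# Lines copied from the logs quoted in this thread.
SAMPLE = [
    "INFO [HintedHandoff:2] 2015-03-11 01:56:00,958 HintedHandOffManager.java (line 469) Timed out replaying hints to /1.1.1.79; aborting (0 delivered)",
    "INFO [HintedHandoff:2] 2016-10-27 14:20:00,333 HintedHandOffManager.java:486 - Timed out replaying hints to /192.168.0.178; aborting (0 delivered)",
    "INFO [HintedHandoff:1] 2016-10-27 14:26:13,393 HintedHandOffManager.java:367 - Started hinted handoff for host: fa16996c-722c-458b-a621-eb53efa79fb2 with IP: /192.168.0.178",
    "INFO [HintedHandoff:1] 2016-10-27 14:28:27,959 HintedHandOffManager.java:486 - Timed out replaying hints to /192.168.0.178; aborting (28850 delivered)",
]

if __name__ == "__main__":
    for host, entry in summarize_hint_timeouts(SAMPLE).items():
        print(host, entry)
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. summarize_hint_timeouts(open("/var/log/cassandra/system.log")). A host whose timeouts are all "0 delivered" is probably unreachable for the whole replay window, while a mix of zero and non-zero counts fits the flapping scenario.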
Upvotes: 0
Reputation: 1931
The fact that you have hints on that node indicates that the node itself is up. What these log lines say is that nodes 1.1.1.79 and 1.1.1.76 are down or, more likely, flapping. You should check their status. Run nodetool tpstats on these nodes; if they are up, look for any dropped mutations. Inspect the logs.
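If you have several nodes to check, the dropped-mutation check can be scripted. This is a rough Python sketch, assuming that nodetool is on the PATH and that tpstats prints a "Message type / Dropped" section with one message type per line, as it does on the 2.x versions I have used; the layout may differ on yours. The helper names and the sample text are made up for illustration and are not real output from any cluster.

```python
import subprocess


def dropped_counts(tpstats_text):
    """Parse the 'Message type / Dropped' section into {message_type: dropped}."""
    counts = {}
    in_dropped_section = False
    for line in tpstats_text.splitlines():
        if line.startswith("Message type"):
            in_dropped_section = True  # header of the dropped-messages table
            continue
        parts = line.split()
        if in_dropped_section and len(parts) >= 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


def fetch_tpstats(host):
    """Run 'nodetool -h <host> tpstats' (assumes JMX on the host is reachable)."""
    result = subprocess.run(
        ["nodetool", "-h", host, "tpstats"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hand-written approximation of the tpstats layout; the numbers are invented.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0         123456         0                 0
HintedHandoff                     0         1             42         0                 0

Message type           Dropped
READ                         0
MUTATION                  1532
"""

if __name__ == "__main__":
    print(dropped_counts(SAMPLE).get("MUTATION", 0))
```

On a live cluster you would call dropped_counts(fetch_tpstats("1.1.1.79")) for each suspect node. A non-zero MUTATION count on a node that is up suggests it is shedding writes under load, which would explain hints piling up on the coordinators.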
Upvotes: 2