Reputation: 960
We have a 3-node Cassandra cluster with RF = 2. The read and write consistency levels are both set to ONE, and we are using vnodes. Let's label the nodes N1, N2 and N3, and say that N3 goes down. I was under the impression that whenever a node goes down, the other nodes store hints, and when N3 comes back up those hints are sent to it, thereby ensuring that the data is consistent across replicas. However, while going through the docs I came across the parameter max_hint_window_in_ms, which defaults to 3 hours. So, if a node is down for more than 3 hours, it is considered permanently dead and no further hints are stored for it. So far, so good.
So, my understanding now is that if a node is down for, say, 10 hours, the hints for the first 3 hours would be transferred to it when it comes back up, but the writes from the remaining 7 hours would be missing on this node. Moreover, since this node is also eligible to serve read requests for its token ranges, a read query routed to it would return null instead of the actual data that is stored on some other node. Is my understanding correct? What, then, should be done?
Upvotes: 2
Views: 398
Reputation: 57748
What, then, should be done?
The docs state that when you bring the down node (N3) back up, you then have to run a repair on it.
Honestly though, in most of our clusters, I find it easier to simply remove the node (while it is down) and then re-bootstrap it into the cluster. That usually goes faster than computing Merkle trees and streaming repair data. But if you don't have a lot of data per node (say less than 20 GB), running a repair shouldn't be too painful.
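To put numbers on the scenario from the question, here is a rough Python sketch of how an outage splits into a hinted part and an unhinted part. It is only an illustration: hint_coverage and needs_repair are hypothetical helpers, not part of Cassandra, and the model assumes hints are collected for the whole window and replayed successfully, which real clusters don't guarantee.

```python
from datetime import timedelta

# Default hint window cited in the question (max_hint_window_in_ms = 3 hours).
MAX_HINT_WINDOW = timedelta(hours=3)


def hint_coverage(downtime, hint_window=MAX_HINT_WINDOW):
    """Split an outage into the part covered by hints and the part that is not.

    Hypothetical helper for illustration only; assumes hints are stored
    for the first `hint_window` of the outage and nothing after that.
    """
    hinted = min(downtime, hint_window)
    unhinted = downtime - hinted
    return hinted, unhinted


def needs_repair(downtime, hint_window=MAX_HINT_WINDOW):
    """True when the outage outlasted the hint window, so hints alone cannot catch the node up."""
    return downtime > hint_window


hinted, unhinted = hint_coverage(timedelta(hours=10))
print(hinted, unhinted, needs_repair(timedelta(hours=10)))
```

For the 10-hour outage in the question, this gives 3 hours covered by hints and 7 hours that are not, which is the gap that the repair (or the re-bootstrap) has to fill.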
Upvotes: 2