Jason Mantra

Reputation: 105

Why would the coordinator store hints about dead replicas if a replica node for the row is known to be down ahead of time?

According to the DataStax online docs (https://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_about_hh_c.html), the following is stated:

During a write operation, when hinted handoff is enabled and consistency can be met, the coordinator stores a hint about dead replicas in the local system.hints table under either of these conditions:

- A replica node for the row is known to be down ahead of time.
- A replica node doesn't respond to the write request.

What confuses me is why the first bullet point is a condition for storing hints when the coordinator already knows ahead of time that the node is down.

According to the docs here (https://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure), the only time Cassandra will fail a write is when too few replicas are alive when the coordinator receives the request.

From what I've read so far, hints are utilized only when the required replicas are alive at the time the coordinator receives the request and one or more of them become unresponsive afterwards. The first bullet point, however, says that hints are stored when a replica node is already down. If Cassandra automatically fails a write when too few replicas are alive, what's the point of storing hints for a write that has already been deemed a failure?
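To make the failure case concrete, here is how I read it, sketched with the DataStax Python driver (the contact point, keyspace, and table ks.t are placeholders I made up):

# Sketch of my reading; assumes a keyspace "ks" with replication
# factor 3 and a hypothetical table ks.t (id int PRIMARY KEY, val text).
from cassandra import ConsistencyLevel, Unavailable
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect()

write = SimpleStatement(
    "INSERT INTO ks.t (id, val) VALUES (1, 'x')",
    consistency_level=ConsistencyLevel.QUORUM,
)

try:
    session.execute(write)
except Unavailable as exc:
    # Too few replicas were alive when the coordinator received the
    # request, so the write was rejected outright.
    print("rejected:", exc.alive_replicas, "alive,",
          exc.required_replicas, "required")

With 2 of 3 replicas down, I'd expect the Unavailable branch; with only 1 of 3 down, I'd expect success. My confusion is about what hints are good for in the former case.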

Upvotes: 0

Views: 212

Answers (2)

Simon Fontana Oscarsson

Reputation: 2124

In Cassandra there is a thing called consistency level (CL) that is set for every request. The CL defines how many nodes have to respond to a request for it to be successful.

For example, say you have a replication factor of 3. You don't necessarily need all 3 nodes to respond in order to know you got a correct result; a majority (called quorum) is enough. Don't misinterpret this, though: if the request is a write, the coordinator still writes to all 3 replicas, since you want the data replicated on all nodes. But because you specified CL quorum, only a majority, i.e. 2 nodes, have to respond back to the coordinator for the client request to be a success.
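A minimal Python sketch of that majority arithmetic (quorum is a strict majority of the replication factor):

# Quorum for a given replication factor: floor(RF / 2) + 1.
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

print(quorum(3))  # 2 -- two of three replicas must acknowledge
print(quorum(5))  # 3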

So the full write path for a client request with CL quorum could be the following:

                                      --write--> Node 1
Client --write request--> Coordinator --write--> Node 2
                                      --write--> Node 3

                                <--success-- Node 1
Client <--success-- Coordinator <--success-- Node 2
                                <--success-- Node 3

This is the case where all 3 nodes respond with a success to the coordinator.

We could also get a timeout from one of the nodes:

                                      --write--> Node 1
Client --write request--> Coordinator --write--> Node 2
                                      --write--> Node 3

                                <--success-- Node 1
Client <--success-- Coordinator <--success-- Node 2
                                <--timeout-- Node 3

Since we only need success from a majority of the nodes, the request will still be successful. In this case a hint will be stored on the coordinator, which will try to replay the write (send it again) once Node 3 is responsive.

In the case that the node is already known to be down or unresponsive, the same thing happens: the coordinator only cares whether enough nodes respond to satisfy the consistency level.
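A rough client-side sketch of both outcomes with the DataStax Python driver (the table ks.t is a made-up placeholder; replication factor 3 assumed): if a quorum responds, execute() returns normally and any hint for the dead replica is written behind the scenes; only when fewer than a quorum respond in time does the driver surface a WriteTimeout.

from cassandra import ConsistencyLevel, WriteTimeout
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect()

write = SimpleStatement(
    "INSERT INTO ks.t (id, val) VALUES (2, 'y')",
    consistency_level=ConsistencyLevel.QUORUM,
)

try:
    session.execute(write)
    # 2 of 3 replicas answered: success from the client's point of view,
    # even if Node 3 was down. The coordinator stores a hint for Node 3
    # and replays it later; none of that is visible here.
    print("write succeeded")
except WriteTimeout as exc:
    # Fewer than a quorum responded in time, so the request fails even
    # though some replicas may have applied the write.
    print("timed out:", exc.received_responses, "of",
          exc.required_responses, "responses")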

Upvotes: 0

MarcintheCloud

Reputation: 1653

In the Cassandra world, there's usually an expectation that a node left in the "down" state will eventually come back up. Otherwise you, as the operator, would decommission it.

Cassandra doesn't know whether that node will come back up, so it stores the hint in case the node does come back before the hint window expires. Recovering from a hint is faster than recovering through Cassandra's repair process (which should always be running anyway).

Upvotes: 0
