Reputation: 43
Let's say we have a Cassandra cluster of 6 nodes with RF=3. Suppose we send a query that reads data from a particular node, and that node fails while it is processing the request or transferring the data. What are the possible outcomes in the following scenarios?
Let's say the node is reading the required data from disk and dies in the process. Will the coordinator (the node that received our request) resend the request to one of the other replicas, or just return an error to the client?
Let's say the node died while it was transferring data. Will the coordinator return partial data? Or will the coordinator realise that the information is incomplete and re-send the request to a different node (a replica)?
In either case, as programmers, do we have to explicitly code any conditions to tell the Cassandra server what to do, or is it all taken care of internally?
Thanks in advance.
P.S.: I am sorry if a similar question has been asked before. I did try searching, but I couldn't find one.
Upvotes: 2
Views: 735
Reputation: 13731
One of the most important concepts to understand in Cassandra is its tunable "Consistency Level", or CL. Perhaps the most common setting is CL=QUORUM, which means that with RF=3 (each piece of data is replicated on 3 nodes), Cassandra requires successful responses from two of the three replicas before returning a result to the client.
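To make the QUORUM arithmetic concrete, here is a minimal sketch. The function names are made up for illustration and are not part of any Cassandra API; the only fact used is that a quorum is a majority of the replicas.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at CL=QUORUM (a majority)."""
    return replication_factor // 2 + 1


def tolerated_replica_failures(replication_factor: int) -> int:
    """Replicas that can be down while a QUORUM request can still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, "
          f"tolerated failures={tolerated_replica_failures(rf)}")
```

With RF=3 this gives a quorum of 2 and one tolerated replica failure, which is the situation described in the question.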
For a request for a particular partition, the coordinator starts by sending the client's request to 2 of the 3 replicas known to hold that partition. Cassandra keeps an estimate of typical response latency, and if a response has not arrived by the time this estimate has elapsed, it sends the request to the third replica as well. Such a timeout is what happens in both cases you mentioned: if a replica's response doesn't complete quickly (it doesn't matter whether it partially completed), the third request is sent. Unless two of the three replicas are down at the same time, you will get your complete response, and the client doesn't need to take care of anything. This is the "high availability" feature that Cassandra and other NoSQL databases are famous for.
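The behaviour described above can be sketched as a toy model in plain Python. This is not Cassandra's actual code or driver API: `Replica`, `coordinate_read`, and the use of `None` to stand in for a timed-out response are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    alive: bool = True

    def read(self, key: str):
        # A dead replica never answers; None stands in for a timeout.
        return f"{key}@{self.name}" if self.alive else None


def coordinate_read(key, replicas, required=2):
    """Toy coordinator: ask `required` replicas first, then the rest as backups."""
    first, backups = replicas[:required], replicas[required:]
    responses = [r for r in (rep.read(key) for rep in first) if r is not None]
    for rep in backups:
        if len(responses) >= required:
            break
        # An expected response did not arrive in time: try another replica.
        result = rep.read(key)
        if result is not None:
            responses.append(result)
    if len(responses) < required:
        raise RuntimeError(f"only {len(responses)} of {required} replicas responded")
    return responses


# One replica dies mid-request: the third replica fills in.
replicas = [Replica("n1"), Replica("n2", alive=False), Replica("n3")]
print(coordinate_read("k", replicas))  # ['k@n1', 'k@n3']
```

In this model the caller only sees an error when two of the three replicas are unavailable, which mirrors the point that the client does not need special handling for a single node failure.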
Note that this answer holds even for extremely long responses (scanning the entire table, or fetching a very long partition). Such long responses are broken up into "pages" of reasonable length. Each page is fetched in a separate request and can come from any 2 of the 3 replicas, not necessarily the same ones each time.
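A small simulation may help show why paging makes long reads survive node failures. Again, this is only a sketch under assumptions: `fetch_page`, `scan`, and the `live_by_page` list (which replicas happen to be up when each page is requested) are hypothetical and do not correspond to a real driver API.

```python
def fetch_page(rows, start, page_size, live_replicas, required=2):
    """Toy page request: needs `required` live replicas at the time of the request."""
    if len(live_replicas) < required:
        raise RuntimeError("not enough live replicas for this page")
    serving = sorted(live_replicas)[:required]
    return rows[start:start + page_size], serving


def scan(rows, page_size, live_by_page):
    """Read all rows page by page; live_by_page[i] is the set of replicas up for page i."""
    collected, served_by = [], []
    for page_no, start in enumerate(range(0, len(rows), page_size)):
        chunk, serving = fetch_page(rows, start, page_size, live_by_page[page_no])
        collected.extend(chunk)
        served_by.append(serving)
    return collected, served_by


rows = list(range(6))
# n1 is down for the second page, n2 is down for the third.
live = [{"n1", "n2", "n3"}, {"n2", "n3"}, {"n1", "n3"}]
data, served_by = scan(rows, page_size=2, live_by_page=live)
print(data)       # [0, 1, 2, 3, 4, 5]
print(served_by)  # [['n1', 'n2'], ['n2', 'n3'], ['n1', 'n3']]
```

Because each page is an independent request, a replica that fails partway through a long scan only affects which replicas serve the following pages, as long as two of the three remain up.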
Everything I wrote above applies to Scylla as well as Cassandra.
Upvotes: 4