J. Doe
J. Doe

Reputation: 13063

EACH_QUORUM VS QUORUM

This is a screenshot from the consistency level table according to Datastax documentation:

enter image description here

What is the difference between EACH_QUORUM and QUORUM? Each and all DC's are the same AFAIK. In the QUORUM row the following is stated:

Some level of failure is possible

Why? If one node is down in each DC? The same applies for EACH_QUORUM right? Why does EACH_QUORUM does not have some level of failure, since it is ALL_QUORUM and not ALL?

Both levels have the same in common (AFAIK):

Upvotes: 4

Views: 4187

Answers (3)

Jim Wartnick
Jim Wartnick

Reputation: 2196

The difference between QUORUM and EACH_QUORUM is as follows.

Assume you have 6 nodes in your cluster - 2 DCs with 3 nodes each and RF=3 for both DCs (all nodes have all data).

The QUORUM and EACH_QUORUM value is the same = 4 (6/2 + 1). However, which nodes can respond varies slightly. EACH_QUORUM has less combinations of what will satisfy the condition.

QUORUM requires 4 nodes to respond but with any combination of nodes. So for example, maybe 3 nodes from the local DC and 1 node from the remote DC respond. That's perfectly fine.

Now, with EACH_QUORUM, each DC must have a quorum respond. What the means is that 2 nodes from each DC must respond in this case, that's it (which 2 nodes in each DC is irrelevant) . 3 nodes from the local DC and 1 node from a remote DC does not qualify as 1 node in the remote dc is not a quorum of that dc.

Let's change the cluster node count to 7 instead of 6. DC1 has 4 nodes, DC2 has 3 nodes. DC1 RF = 4 and DC2 RF = 3 (all nodes have the data again). Here's where the fun begins with the odd number total in the RF.

While I'm not sure about the word "failure", but I can see certain scenarios where this could be problematic.

For QUORUM, 4 nodes need to respond (7/2 + 1 = 4) - any 4 nodes - including the scenario when all nodes from the local/larger DC responds (DC1 in this case). What if the most current data is on DC2? In this scenario, you could end up with undesirable results.

With EACH_QUORUM, 5 nodes would need to respond (Quorum of DC1 = 4/2+1 = 3, Quorum of DC2 = 3/2+1 = 2 ==> total = 5). With this scenario, you're forcing Cassandra to return data from both DCs - and a QUORUM level from each DC which should give you good results.

Again, I'm trying in my head to determine where the additional "failures" could come with QUORUM v.s. EACH_QUORUM and I can't at the top of my head see it. It would seem if anything, EACH_QUORUM with an odd node count, is less flexible in unavailable nodes as a quorum in each DC must respond v.s. any quorum number of nodes from any DC. I can see where QUORUM may give you undesirable results though (explained above).

Upvotes: 4

LetsNoSQL
LetsNoSQL

Reputation: 1538

Difference between EACH_QUORUM and QUORUM like below:-

i.e We have 2 DCs with 5 nodes each. Total 10 nodes in the Cassandra cluster and replication factor is 2 on each DC1 and 3 on DC2.

So the formula is replication factor/2 +1. QUORUM needs here 5(2+3)/2 +1= 3 nodes(2 nodes can respond the request for DC1 but 3 nodes can respond the request for DC2) to acknowledge the coordinator node.

EACH_QUORUM needs here 2/2 +1 = 2 and 3/2 +1 =2 nodes must acknowledge on each DC to the coordinator node.

https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/dml/dmlConfigConsistency.html

Some level of failure is possible it can come when sometimes some of the node of DC1 will not respond then data will not replicate and failure can be observed but other DC2 will fulfill the request.

Upvotes: 0

Carlos Monroy Nieblas
Carlos Monroy Nieblas

Reputation: 2283

One thing to consider is that QUORUM is related to the Replication Factor(RF) and this will determine the number of nodes that can be offline for each datacenter and allowing to complete the transaction. This means that If one node is down in each DC, this doesn't necessarily will cause inconsistency or failed queries.

For this, use the formula:

NodesNeededForQuorum = ReplicationFactor / 2 + 1

remember to round down the result.

It may be easier to demonstrate the difference with the following scenario: Let's assume that you have 2 DC's with RF of 3 in each datacenter; if you use QUORUM it will require at least 4 nodes from any DC are able to process the query, it could be 2 from each DC, 3 from DC1 and 1 from DC2, or 1 from DC1 and 3 from DC2. With EACH_QUORUM it will also need that 4 nodes are able to answer, but they should be only 2 from each DC.

If you have 3 DC's with RF of 3, QUORUM will be fulfilled with 5 nodes from any DC, while EACH_QUORUM will require 6 nodes (2 from each DC).

Things can be more complex if the RF is different between DC's, and that will depend on the cluster design.

When using EACH_QUORUM please consider the latency while communicating within different DC's, if the network communication is slow, or they are located in distant geographical locations there could be timeouts of the queries.

Upvotes: 1

Related Questions