user8729669

Reputation: 31

Consistency over availability in CAP Theorem

The Wikipedia article on the CAP theorem (https://en.wikipedia.org/wiki/CAP_theorem) states (bold emphasis mine): "When choosing consistency over availability, the system will return an error or a time-out if particular information cannot be guaranteed to be up to date due to network partitioning."

If so, doesn't choosing consistency over availability mean we lose partition tolerance as well? The system might be up, but if it returns errors for all my data access, what good is it? Or does "network partitioning" imply data partitioning as well here? In other words, if data partitioning is also implied, at least some parts of the data are known to be up to date and can be returned while still satisfying the consistency requirement.

Upvotes: 3

Views: 1945

Answers (1)

matino

Reputation: 17725

Assume you have two datacenters, each with its own database, and your system allows clients to connect to either one. Both datacenters must be kept in sync, so there is a network link between them.

Now imagine that the network link goes down and the databases can't communicate with each other anymore (this is what a network partition means). What do you do now as an application developer?

You basically have 2 options:

1) Make the system available, which by CAP definition means:

Every request received by a non-failing [database] node in the system must result in a [non-error] response

Note that in our example both nodes are non-failing (they are up and running).

In other words, you can allow all clients from both datacenters to read and write data, but you lose consistency (see below for the definition), since writes to one database won't be visible in the other.

2) Make the system consistent (note that this has nothing to do with ACID consistency), which by the CAP definition means linearizability. In simple terms: once a write has happened, it must be seen by the whole system (no node may return the previous state).

In our case it means you need to reject reads and writes from one of the datacenters, so only one datacenter remains operational. Such a system is not useless at all, and you don't lose partition tolerance, since you can reroute all your clients to the operational database.
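To make the two options concrete, here is a minimal toy sketch in Python of the two-datacenter setup described above. Everything in it (the `Node` class, the `"AP"`/`"CP"` mode strings, the statically chosen primary, the `Unavailable` exception) is a hypothetical illustration, not how any real database implements this; real systems typically decide which side stays operational through quorums or leader election rather than a fixed flag.

```python
class Unavailable(Exception):
    """Raised when a node refuses a request in order to stay consistent."""


class Node:
    """Toy database node living in one datacenter (illustrative only)."""

    def __init__(self, name, mode, is_primary=False):
        self.name = name
        self.mode = mode              # "AP" or "CP" (hypothetical labels)
        self.is_primary = is_primary  # assumption: fixed in advance
        self.data = {}
        self.peer = None
        self.link_up = True

    def _must_reject(self):
        # In CP mode, the non-primary side stops serving during a partition.
        return (not self.link_up) and self.mode == "CP" and not self.is_primary

    def write(self, key, value):
        if self._must_reject():
            raise Unavailable(f"{self.name}: cannot guarantee consistency")
        self.data[key] = value
        if self.link_up:
            self.peer.data[key] = value  # replicate while the link works

    def read(self, key):
        if self._must_reject():
            raise Unavailable(f"{self.name}: data may be stale")
        return self.data.get(key)


def make_pair(mode):
    """Build two connected nodes, dc1 being the designated primary."""
    dc1 = Node("dc1", mode, is_primary=True)
    dc2 = Node("dc2", mode)
    dc1.peer, dc2.peer = dc2, dc1
    return dc1, dc2


def partition(a, b):
    """Simulate the network link between the datacenters going down."""
    a.link_up = b.link_up = False


# Option 1: stay available. Both nodes answer, but dc2 serves a stale value.
dc1, dc2 = make_pair("AP")
dc1.write("x", 1)
partition(dc1, dc2)
dc1.write("x", 2)
ap_result = (dc1.read("x"), dc2.read("x"))

# Option 2: stay consistent. dc2 rejects requests, dc1 keeps serving.
dc1, dc2 = make_pair("CP")
dc1.write("x", 1)
partition(dc1, dc2)
dc1.write("x", 2)
try:
    dc2.read("x")
    cp_dc2 = "answered"
except Unavailable:
    cp_dc2 = "rejected"
cp_dc1 = dc1.read("x")
```

In the available variant the two reads disagree after the partition (dc1 has the new value, dc2 still has the old one). In the consistent variant dc2 returns an error, while clients routed to dc1 still get up-to-date data, which is why the system as a whole remains useful.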

There is a lot of confusion around the CAP theorem and I recommend that you read an excellent blog post by Martin Kleppmann that helped me understand a lot about the subject: https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html

Upvotes: 4
