Aaron Watters

Reputation: 2846

Is the CAP theorem a red herring?

I am told that I have to give up transactional guarantees in large distributed systems because the CAP theorem says I can't have them.

I think this is wrong for the following reasons:

Therefore, I can assume that for practical purposes I can have transactional behavior provided I attempt to guarantee that small partitions detect that they are disconnected and shut down or operate in some sort of degraded mode until the connection is repaired.
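A minimal sketch of the "detect that you are in the small partition and degrade" idea, assuming a fixed, known cluster size and a `reachable` count supplied by some failure detector (both are hypothetical; nothing above specifies them). A node keeps accepting transactions only while it can see a strict majority of the cluster:

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """Return True when this node sees a strict majority of the cluster.

    `reachable` counts the nodes this node can currently contact,
    including itself. Both arguments are illustrative assumptions.
    """
    return reachable > cluster_size // 2


def mode_for(reachable: int, cluster_size: int) -> str:
    # Hypothetical policy: a minority partition stops accepting writes.
    return "read-write" if has_quorum(reachable, cluster_size) else "degraded"


if __name__ == "__main__":
    for seen in range(1, 6):
        print(seen, "of 5 reachable:", mode_for(seen, 5))
```

Note that with an even split (say 2 of 4 on each side) neither side has a majority, so both sides degrade. That lost service is the availability cost this approach accepts.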

Corrections? Comments? Flames?



Upvotes: 2

Views: 1354

Answers (2)

Erik Lucio

Reputation: 948

I recommend you read this paper: Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services.

After reading it, I understood two things.

First, Brewer's conjecture is about any web service, not just distributed ones. In that context, it makes sense that you can choose two of the three properties. For example, let's ignore partition tolerance. From my point of view, in this case you have two options:

  1. Use a single machine: there is no network, so I don't have to worry about partition tolerance, but the availability of the entire system depends on that single machine being online.
  2. Use multiple machines, but without replication: use each machine to process and store data that has no strong functional relationship to the data on the other machines. If a machine crashes, the rest of the data is still available and consistent.
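Option 2 can be sketched roughly as follows. This is only an illustration under assumptions: the `ShardedStore` class, its hash-based key placement, and the `down` set used to simulate crashed machines are all hypothetical names, not something taken from the paper. Each key lives on exactly one machine, so a crash makes only that machine's keys unavailable:

```python
import hashlib


class ShardedStore:
    """Toy key-value store: several machines, no replication."""

    def __init__(self, n_machines: int):
        self.shards = [dict() for _ in range(n_machines)]
        self.down = set()  # indices of machines simulated as crashed

    def owner(self, key: str) -> int:
        # Stable placement: hash the key and map it to one machine.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key: str, value) -> None:
        i = self.owner(key)
        if i in self.down:
            raise ConnectionError(f"machine {i} is down")
        self.shards[i][key] = value

    def get(self, key: str):
        i = self.owner(key)
        if i in self.down:
            raise ConnectionError(f"machine {i} is down")
        return self.shards[i][key]
```

Because there is only one copy of each item, the copies can never disagree; the price is that a crashed machine takes its share of the data offline until it comes back.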

You might think there is a third option: use multiple machines with replication and don't care about partition tolerance. Let's suppose that. In this case, if a machine or a connection between machines fails, there is no way to guarantee consistency or availability, because the system has no process for recovering the correct state of its data. Note here that adding replication means increasing partition tolerance.

The second thing I understood is:

In distributed systems, where we use several machines to spread out computation and data storage, partition tolerance is an intrinsic property. We use a group of machines as if they were one in order to increase processing and storage resources and availability for clients (not availability in the CAP sense). Therefore, one way to increase availability for clients is to support partition tolerance, an intrinsic property of distributed systems.

In summary, the CAP theorem applied to distributed systems says the following: under partition tolerance, it is impossible to guarantee consistency and availability at the same time.

Upvotes: 0

DarthVader

Reputation: 55032

The CAP theorem was proven by Nancy Lynch et al. at MIT.

Your assumptions are not good. Yes, you can have transactions in a distributed system, but then you have to wait for every transaction to complete across the system. That is where availability suffers. So you can have consistency and partition tolerance, but not availability.

In the other case, you can have availability and partition tolerance but not consistency, as with MongoDB or Cassandra (with eventual consistency configured). In this case, you can have multiple DB servers, but your data won't be visible on all the servers right away. You give up consistency, but you gain availability and partition tolerance.

The last case is the easiest one: you have consistency and availability, but no partition tolerance. Think of a single database server.
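To make the first two cases concrete, here is a toy model, not how any real database works: two replicas that normally copy every write to each other. The `Replica` class, its `mode` flag, and the `partitioned` switch are made up for illustration. During a partition, a "CP" replica refuses the write (it stays consistent but is unavailable), while an "AP" replica accepts it locally and lets the peer go stale:

```python
class Replica:
    def __init__(self, mode: str):
        self.mode = mode          # "CP" or "AP" (illustrative labels)
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the peer is unreachable

    def write(self, value) -> None:
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than diverge.
                raise TimeoutError("peer unreachable; write refused")
            # Availability first: accept locally; the peer is now stale.
            self.value = value
            return
        # Healthy network: apply the write on both replicas.
        self.value = value
        self.peer.value = value


def make_pair(mode: str):
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a
    return a, b


def set_partition(a: Replica, b: Replica, cut: bool) -> None:
    a.partitioned = b.partitioned = cut
```

In the AP case the two replicas disagree until the partition heals and something reconciles them, which this sketch deliberately leaves out.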

In regard to your points:

  • Internet routing is amazingly reliable.

Seamlessly reliable.

  • The CAP theorem only applies to network partitions where two groups of live machines can't communicate.

The CAP theorem applies to any distributed system.

The other two points don't really make much sense.

There are some other professors who claim that CAP is incomplete, and that there is more to it, such as latency. But the CAP theorem makes perfect sense.

There is also the "BASE" approach (Basically Available, Soft state, Eventual consistency). Many NoSQL databases favor it.

Check out my blog on the CAP theorem and NoSQL.

Upvotes: 7
