Vishal Sharma

Reputation: 1750

Strong Consistency in Cassandra

According to a DataStax article, strong consistency can be guaranteed if R + W > N, where R is the consistency level of read operations, W is the consistency level of write operations, and N is the number of replicas.
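
As a quick illustration of the condition, here is a minimal sketch in Python (assuming RF = 3; the function name is mine, not from the article):

```python
# Minimal sketch: check the R + W > N overlap condition for a few
# read/write consistency-level combinations, assuming N = 3 replicas.
N = 3                 # number of replicas (replication factor)
QUORUM = N // 2 + 1   # 2 when N = 3

def is_strongly_consistent(r: int, w: int, n: int = N) -> bool:
    """True if any read set must overlap any write set by at least one node."""
    return r + w > n

for label, r, w in [("ONE/ONE", 1, 1),
                    ("QUORUM/QUORUM", QUORUM, QUORUM),
                    ("ONE/ALL", 1, N)]:
    print(f"{label}: R + W > N is {is_strongly_consistent(r, w)}")
```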

What does strong consistency mean here? Does it mean that every time a query's response is returned from the database, the response will always be the last updated value? If the conditions for strong consistency are maintained in Cassandra, are there no scenarios where the data returned might be inconsistent? In short, does strong consistency mean 100% consistency?

Edit 1

Adding some additional material on scenarios where Cassandra might not be consistent even when R + W > RF:

  1. Write fails with Quorum CL
  2. Cassandra's eventual consistency

Upvotes: 9

Views: 9880

Answers (5)

Stan

Reputation: 862

I would actually regard this strong consistency as strong read consistency. And it is sessional, a.k.a. Monotonic Read Consistency (refer to @NadavHar'El's answer).

But it is not sequential consistency, as Cassandra doesn't fully support locks or transactions, and does not serialize write operations. There are only lightweight transactions, which support local serialization of write operations and serialization of read operations.

To make things easy to understand, let's say we have three nodes - A, B and C - and set the read consistency level to 3 and the write consistency level to 1.

If there is only one client, it writes to any node, say A. B and C might not be synchronized yet (eventually they will be -- eventual consistency).

But when the client reads again, it needs responses from at least three nodes, and by comparing the latest timestamps we will use A's record. This is Monotonic Read Consistency.
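
A toy simulation of that read path (my own sketch, not Cassandra internals): with W = 1 only A holds the new record, and an R = 3 read keeps the response with the latest timestamp:

```python
# Toy model of the scenario above: W = 1 (only A got the write), R = 3.
replicas = {
    "A": {"value": "new", "timestamp": 2},  # received the write
    "B": {"value": "old", "timestamp": 1},  # not yet synchronized
    "C": {"value": "old", "timestamp": 1},  # not yet synchronized
}

def read_quorum(nodes):
    responses = list(nodes.values())                     # R = 3: ask all nodes
    return max(responses, key=lambda r: r["timestamp"])  # latest write wins

print(read_quorum(replicas))  # -> {'value': 'new', 'timestamp': 2}
```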

However, if two clients try to update the record at the same time, or if they both read the value first and then rewrite it (e.g. increase a column by 100) at the same time, there is a problem: client C1 and client C2 both read the current column value as 10, and they both decide to increase it by 100. C1 only needs to write 110 to one node, C2 will do the same, and the final result on any node can be at most 110.

Then we lose 100 in these operations (lost updates). This is an issue caused by a race condition, i.e. a concurrency problem. It has to be fixed by serializing the operations and using some form of locking, just like how SQL databases implement transactions.

I know Cassandra now has counter columns, which might solve this, but they are still limited compared to full transactions. And Cassandra is also not supposed to be transactional, as it is a NoSQL database that sacrifices consistency for availability.
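
For reference, a compare-and-set via a lightweight transaction is one way to avoid the lost update described above. Here is a sketch with the DataStax Python driver; the keyspace, table and column names are hypothetical:

```python
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("demo_ks")  # hypothetical keyspace

# Read the current value, then update only if it is unchanged.
row = session.execute(
    "SELECT balance FROM accounts WHERE id = %s", (42,)
).one()

# The IF clause turns the blind write into a lightweight transaction,
# serialized via Paxos; a concurrent writer makes it fail instead of
# silently losing an update.
result = session.execute(
    "UPDATE accounts SET balance = %s WHERE id = %s IF balance = %s",
    (row.balance + 100, 42, row.balance),
)
print("applied:", result.was_applied)  # False -> lost the race, retry
```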

Upvotes: 1

Prakhar Agrawal

Reputation: 1022

While this is an old question, I thought I would chip in to set the record straight.

R+W>RF does not imply strong consistency

A system with R + W > RF will only be eventually consistent. The claim of a strong consistency guarantee breaks during node failures or in between writes. For example, consider the following scenario:

Assume that there are 3 nodes A, B, C with RF=3, W=3, R=2 (hence, R + W = 5 > 3 = RF).

Further assume key k is associated with value v, i.e. (k,v) is stored in the database. Suppose the following series of actions occurs:

  • t=1: (k,v1) write request is sent to A,B,C from a user
  • t=2: (k,v1) reaches A and is written to store at A
  • t=3: Reader 1 sends a read request for key k, which is replied to by A and B
  • t=4: Reader 1 receives response (k,v1) - by latest write wins rule
  • t=5: Reader 1 sends another read request which gets served by nodes B and C
  • t=6: Reader 1 receives response (k,v), which is an older value: INCONSISTENCY
  • t=7: (k,v1) reaches C and is written to store at C
  • t=8: (k,v1) reaches B and is written to store at B

This demonstrates that W + R > RF cannot guarantee strong consistency. To ensure strong consistency you might want to use another algorithm, such as Paxos or Raft, that can help ensure that the writes are atomic. You can read an interesting article on the same here (do check out the FAQ section).
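
A toy replay of this timeline (my own sketch; it deliberately ignores Cassandra's read repair, which the edit below covers):

```python
# Replicas store (value, write_time); a read queries two replicas (R = 2)
# and returns the record with the latest write time.
store = {"A": ("v", 0), "B": ("v", 0), "C": ("v", 0)}

def read(nodes):
    return max((store[n] for n in nodes), key=lambda rec: rec[1])[0]

store["A"] = ("v1", 1)       # t=2: the W=3 write has reached A only
print(read(["A", "B"]))      # t=3/4: 'v1' (latest write wins)
print(read(["B", "C"]))      # t=5/6: 'v' -> an older value again
store["C"] = ("v1", 1)       # t=7: write reaches C
store["B"] = ("v1", 1)       # t=8: write reaches B
```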


Edit:

Cassandra does have an internal mechanism, called blocking read repair, that triggers synchronous writes before the response from the DB is sent back to the client. This kind of synchronous read repair occurs when there are inconsistencies among the nodes queried to achieve the read consistency level, and it ensures something known as Monotonic Read Consistency [see below for definitions]. This causes (k,v1) in the above example to be written to node B before the response to the first read request is returned, so the second read request would also see the updated value. (Thanks to @Nadav Har'El for pointing this out.)
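
A simplified sketch of that blocking read-repair idea (my own illustration, not Cassandra's implementation): the coordinator pushes the newest record to any stale replica it queried, before answering the client:

```python
# Continuing the toy model: A already has v1, B and C are stale.
store = {"A": ("v1", 1), "B": ("v", 0), "C": ("v", 0)}

def read_with_repair(nodes):
    newest = max(nodes, key=lambda n: store[n][1])
    for n in nodes:
        if store[n][1] < store[newest][1]:
            store[n] = store[newest]  # synchronous repair before responding
    return store[newest][0]

print(read_with_repair(["A", "B"]))  # 'v1'; B is repaired as a side effect
print(read_with_repair(["B", "C"]))  # 'v1' again -> monotonic reads hold
```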

However, this still does not guarantee strong consistency. Below are some definitions to clear this up:

Sequential/Strong Consistency: the result of any execution is the same as if the reads and writes occur in some order, and the operations of each individual processor appear in this sequence in the order specified by its program [as defined by Leslie Lamport]

Monotonic Read Consistency: once you read a value, all subsequent reads will return this value or a newer version

Sequential consistency would require the client program/reader to see the latest value that was written, since the write statement is executed before the read statement in the sequence of program instructions.

Upvotes: 3

Mandraenke

Reputation: 3266

Cassandra has tunable consistency with some tradeoffs you can choose.

R + W > N - this simply means there must be at least one overlapping node in your round trip that has the newest data available, in order to be consistent.

For example, if you write at CL.ONE you will need to read at CL.ALL to be sure to get a consistent result: N + 1 > N. But you might not want CL.ALL, as then you cannot tolerate a single node failure in your cluster.

Often you can choose CL.QUORUM at read and write time to ensure consistency and tolerate node failures. For example, at RF=3 a QUORUM needs (3/2)+1 = 2 nodes, so R + W > N becomes 2+2 = 4 > 3 - your requests are consistent AND you can tolerate a single node failure.
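
As a quick sanity check of that arithmetic (plain Python, assuming RF = 3):

```python
rf = 3
quorum = rf // 2 + 1                            # (3/2)+1 = 2 nodes
print("quorum size:", quorum)                   # 2
print("R + W > RF:", quorum + quorum > rf)      # 4 > 3 -> True
print("node failures tolerated:", rf - quorum)  # 1
```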

One thing to keep in mind: it is really important to have tightly synchronized clocks on all your nodes (Cassandra and application); you will want to have NTP up and running.

Upvotes: 4

Horia

Reputation: 2982

For both reads and writes, the consistency levels of ANY, ONE, TWO, and THREE are considered weak, whereas QUORUM and ALL are considered strong.

Upvotes: 2

undefined_variable

Reputation: 6218

Yes. If R + W is greater than the number of replicas, then you will always get consistent data: 100% consistency. But you will have to trade availability to achieve higher consistency.

Cassandra has the concept of tunable consistency (consistency can be set on a per-query basis).
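
For example, with the DataStax Python driver the consistency level can be set per statement (the keyspace and table names here are made up):

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("demo_ks")  # hypothetical keyspace

# Write and read both at QUORUM: with RF = 3, R + W = 2 + 2 = 4 > 3.
write = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(write, (1, "alice"))

read = SimpleStatement(
    "SELECT name FROM users WHERE id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
print(session.execute(read, (1,)).one().name)  # -> alice
```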

Upvotes: 1
