user1947415

Reputation: 983

Cassandra concurrent writes

How does Cassandra guarantee eventual consistency when concurrent writes happen?

For example, Client A writes to tableA.rowA.colA, while at the same time Client B writes to tableA.rowA.colA.

The coordinator nodes distribute the requests to the replica nodes, say NodeA, NodeB, and NodeC.

On NodeA, ClientA's request arrives first. On NodeB, ClientB's request arrives first. Will the replicas then be inconsistent forever?

Upvotes: 4

Views: 5641

Answers (2)

Ashraful Islam

Reputation: 12840

Every write (insert/update/delete) to Cassandra also stores a timestamp with each column. When you execute a read query, the timestamps are used to pick a "winning" update within a single column or collection element.

What if two truly concurrent writes have the same timestamp? In the unlikely case that two timestamps match down to the microsecond, you might end up with a bad version, but Cassandra ensures that ties are broken consistently by comparing the byte values.

So for your case ("On NodeA, the ClientA request arrives first. On NodeB, the ClientB request arrives first"):

  • If ClientA's request has the newer (higher) timestamp, ClientA wins.

  • If ClientB's request has the newer (higher) timestamp, ClientB wins.

  • If ClientA and ClientB both have the same timestamp, the winner is chosen by comparing the values lexically by bytes, so that the value returned is deterministic.
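
As a rough illustration of the rule above, here is a minimal Python sketch of last-write-wins reconciliation for a single column. It is a simplified model, not Cassandra's actual code: the `Cell` type and `reconcile` function are hypothetical names, and tombstones (deletes) and TTLs are ignored.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """Hypothetical model of one column value plus its write timestamp."""
    timestamp: int  # microseconds since the epoch, supplied with the write
    value: bytes    # serialized column value


def reconcile(a: Cell, b: Cell) -> Cell:
    """Pick the winner: higher timestamp first, then the greater value bytes."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Same timestamp: break the tie deterministically on the raw bytes.
    return a if a.value >= b.value else b


client_a = Cell(timestamp=1_700_000_000_000_001, value=b"from-A")
client_b = Cell(timestamp=1_700_000_000_000_002, value=b"from-B")

# The order in which the two writes are seen does not change the outcome.
assert reconcile(client_a, client_b) == client_b
assert reconcile(client_b, client_a) == client_b

# Identical timestamps: the lexically greater value wins on every replica.
tie_1 = Cell(timestamp=5, value=b"apple")
tie_2 = Cell(timestamp=5, value=b"banana")
assert reconcile(tie_1, tie_2) == reconcile(tie_2, tie_1) == tie_2
```

Because the result depends only on the two cells and not on which one arrived first, every replica that has seen both writes picks the same winner.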

Upvotes: 1

RussS

Reputation: 16576

Cassandra follows a "Last Write Wins" policy. The timestamp can be set manually, but by default it is set client-side by the requester (see the DataStax Java Driver docs). The order in which writes arrive is irrelevant. If write A has an earlier timestamp than write B, then it will always be overwritten by write B. The only ambiguous case is when the timestamps match exactly; in that case the greater value wins.

The eventually consistent portion of this is:

  • Assuming A has an earlier timestamp than B
  • If A arrives on Replica 1 and B arrives on Replica 2, the correct state is B
  • Replica 1 will respond with A until it receives the information about B from Replica 2
  • When B is replicated, Replica 1 will respond with B as well.
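
The steps above can be simulated with a small Python sketch. It is a toy model under stated assumptions: `Replica`, `merge`, and `sync` are hypothetical names, each write is a `(timestamp, value)` tuple, and `sync` stands in for whatever mechanism actually carries B to Replica 1 (in a real cluster, for example hinted handoff, read repair, or anti-entropy repair).

```python
def merge(current, incoming):
    """Last write wins on (timestamp, value) tuples; None means no data yet."""
    if current is None:
        return incoming
    if incoming is None:
        return current
    # Tuples compare by timestamp first, then by the value bytes.
    return max(current, incoming)


class Replica:
    """Hypothetical replica holding a single column."""

    def __init__(self, name):
        self.name = name
        self.cell = None

    def write(self, cell):
        self.cell = merge(self.cell, cell)

    def read(self):
        return self.cell


def sync(x, y):
    """Toy stand-in for replication between two replicas."""
    merged = merge(x.cell, y.cell)
    x.cell = merged
    y.cell = merged


write_a = (100, b"A")  # earlier timestamp
write_b = (200, b"B")  # later timestamp

replica_1 = Replica("replica-1")
replica_2 = Replica("replica-2")
replica_1.write(write_a)  # A arrives on Replica 1
replica_2.write(write_b)  # B arrives on Replica 2

# Before replication, Replica 1 still answers with the stale value A.
assert replica_1.read() == write_a

sync(replica_1, replica_2)

# After replication, both replicas answer with B.
assert replica_1.read() == write_b
assert replica_2.read() == write_b

# A arriving late on Replica 2 does not overwrite B.
replica_2.write(write_a)
assert replica_2.read() == write_b
```

The merge is commutative and idempotent, which is why the replicas converge on the same value regardless of the order in which the writes and the replication traffic arrive.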

Most use cases do not involve storing state in Cassandra, so these sorts of issues do not arise.

Upvotes: 9
