Akhil K Kamal

Reputation: 100

Atomic Batch in Cassandra

Consider the following batch statement in Cassandra:

 BEGIN BATCH
 INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user');
 UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2';
 DELETE FROM users WHERE userID = 'user2';
 INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3c', 'Andrew');
 APPLY BATCH;

Will the statements in the above Cassandra batch be applied with row-level isolation, given that they all use the same row key (userID)?

Upvotes: 2

Views: 351

Answers (2)

RussS

Reputation: 16576

One important thing to note: within a batch, unless a timestamp is specified per statement, all of the statements are executed with the same timestamp.
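For contrast, here is a minimal sketch of the question's batch with an explicit timestamp on each statement via `USING TIMESTAMP`. The microsecond values are arbitrary placeholders, chosen only so that they increase from one statement to the next, and the `DELETE` is written without the `*`, which CQL does not accept.

```cql
BEGIN BATCH
  -- Timestamps are illustrative microsecond values, not taken from the question.
  INSERT INTO users (userID, password, name)
    VALUES ('user2', 'ch@ngem3b', 'second user') USING TIMESTAMP 1400000000000001;
  UPDATE users USING TIMESTAMP 1400000000000002
    SET password = 'ps22dhds' WHERE userID = 'user2';
  DELETE FROM users USING TIMESTAMP 1400000000000003
    WHERE userID = 'user2';
  INSERT INTO users (userID, password, name)
    VALUES ('user2', 'ch@ngem3c', 'Andrew') USING TIMESTAMP 1400000000000004;
APPLY BATCH;
```

With distinct, increasing timestamps the writes are ordered as written, so the row from the final INSERT should be the one that survives. Without those clauses, every statement in the batch shares one timestamp.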

This means all four statements you wrote

INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3b', 'second user');

UPDATE users SET password = 'ps22dhds' WHERE userID = 'user2';

DELETE FROM users WHERE userID = 'user2';

INSERT INTO users (userID, password, name) VALUES ('user2', 'ch@ngem3c', 'Andrew');

all happen at the same timestamp, and in that case the highest value for each commonly modified cell is used. All four statements are applied, but the outcome is most likely not what you expect.

Basically, C* will see:

INSERT ('user2', 'ch@ngem3b', 'second user')
INSERT ('user2', 'ps22dhds', 'second user')
INSERT ('user2', 'Tombstone', 'Tombstone')
INSERT ('user2', 'ch@ngem3c', 'Andrew')

Since they all have the same timestamp, C* resolves the conflict by choosing the largest value for each cell, and you will end up with (unless I got the byte ordering wrong here):

('user2', 'ps22dhds', 'second user')

For this kind of operation, consider using the compare-and-set (CAS) operations in C*, also known as lightweight transactions.

http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
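As a rough sketch of what that could look like for the `users` table from the question, the statements below use the `IF` conditions that came with lightweight transactions in Cassandra 2.0. Splitting the work into separate conditional statements is an assumption about the intended outcome, not something stated in the question.

```cql
-- Create the user only if no row with this key exists yet.
INSERT INTO users (userID, password, name)
  VALUES ('user2', 'ch@ngem3b', 'second user')
  IF NOT EXISTS;

-- Change the password only if it still holds the value we expect.
UPDATE users
  SET password = 'ps22dhds'
  WHERE userID = 'user2'
  IF password = 'ch@ngem3b';
```

Each conditional statement returns an `[applied]` column reporting whether the condition held, so the caller can decide whether to retry.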

Upvotes: 2

Stefan Podkowinski

Reputation: 5249

Starting with Cassandra 2.0.6, all statements in a batch that target a single partition are executed as a single update operation. This means the batch is applied with row-level isolation.

Upvotes: 1
