Curious

Reputation: 21540

CockroachDB read transactions

I've been reading about the read-only lock-free transactions as implemented in Google Spanner and CockroachDB. Both claim to be implemented in a lock-free manner by making use of system clocks. Before getting to the question, here is my understanding (please skip the following section if you are aware of the machinery in both systems or just in CockroachDB):

Relevant sources - https://www.cockroachlabs.com/blog/living-without-atomic-clocks/ and https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf


Given this implementation does CockroachDB guarantee that the following two transactions will not see a violation of serializability?

  1. A user blocks another user, then posts a message that they don't want the blocked user to see as one write transaction.
  2. The blocked user loads their friends list and their posts as one read transaction.

As an example, suppose that the friends list and the posts live on different shards, and that the following ordering happens (assuming a max clock error of 2):

  1. The initial posts and friends list were committed at timestamp 5.
  2. A read transaction starts at timestamp 7, it reads the friends list, which it sees as being committed at timestamp 5.
  3. Then the write transaction for blocking the friend and making a post gets committed at 6.
  4. The read transaction reads the posts, which it sees as being committed at timestamp 6.

Now the transactions violate serializability, because the read transaction observed the write transaction's effect on the posts but not its effect on the friends list.

What am I missing?

Upvotes: 1

Views: 553

Answers (1)

Ben Darnell

Reputation: 22154

CockroachDB handles this with a mechanism called the timestamp cache (which is an unfortunate name; it's not much of a cache).

In this example, at step two, when the transaction reads the friends list at timestamp 7, the shard that holds the friends list remembers that it has served a read for this data at t=7 (the timestamp requested by the reading transaction, not the last-modified timestamp of the data), and it can no longer allow any writes to commit at lower timestamps.

Then in step three, when the writing transaction attempts to write and commit at t=6, this conflict is detected and the writing transaction's timestamp gets pushed to t=8 or higher. That transaction must then refresh its reads to see whether it can commit as-is at t=8; if not, an error may be returned and the transaction must be retried from the beginning.

In step four, the reading transaction completes, seeing a consistent snapshot of the data as it existed at t=7, while both parts of the writing transaction are "in the future" at t=8.

Upvotes: 2
