onepiece

Reputation: 3529

Dealing with read eventual consistency by retrying GetItem

I am building API #1, which creates an item in DynamoDB. I'm also building API #2, which retrieves an item using a GSI (the input key may not exist). But GSI reads can only be eventually consistent, and I want to avoid the scenario where API #1 has created an item but API #2 doesn't get that item.

So I am thinking of this:

  1. API #1 creates item via UpdateItem
  2. API #1 tries to retrieve item using GSI via GetItem. Keeps retrying with exponential backoff until it gets the item. Once this happens, eventual consistency should be over.
  3. API #2 retrieves item using same GSI as above via GetItem. Since API #1 already got the item, this should get the item on first try.
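To make step 2 concrete, here is a minimal sketch of the retry loop with exponential backoff. The helper name `retry_until_found` and its parameters are hypothetical, and `fetch` stands in for whatever call actually reads the index. Note that, as far as I know, a GSI cannot be read with GetItem at all (only Query or Scan accept an index name), so in practice `fetch` would wrap a Query.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until_found(
    fetch: Callable[[], Optional[T]],
    max_attempts: int = 6,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch() until it returns a non-None item, backing off exponentially.

    Returns the item, or None if it never showed up within max_attempts.
    """
    for attempt in range(max_attempts):
        item = fetch()
        if item is not None:
            return item
        # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts;
        # no sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    return None


# Assumed usage with boto3 (table, index and attribute names are made up):
#   from boto3.dynamodb.conditions import Key
#   def fetch():
#       resp = table.query(IndexName="my-gsi",
#                          KeyConditionExpression=Key("gsi_pk").eq(value))
#       return resp["Items"][0] if resp["Items"] else None
#   item = retry_until_found(fetch)
```

The loop only shows that one read saw the item at some point; whether that says anything about later reads is exactly the open question.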

Note: I don't think API #2 can do the GetItem retries instead because its input key may not ever exist.

Would this work? Are there better solutions?

Upvotes: 3

Views: 1088

Answers (1)

Nadav Har'El

Reputation: 13731

The property you are looking for is known in the literature as monotonic read consistency: it is eventual consistency (after enough time you'll always read the new value), with the additional guarantee that once you have read the new value, further reads will not return the older value.

I couldn't find (and I tried to look hard...) any documentation guaranteeing that DynamoDB eventually-consistent reads have monotonic read consistency. Based on presentations I saw on DynamoDB's implementation (I don't have any inside knowledge), I believe that it in fact does not have monotonic read consistency:

From what I understood in those presentations, DynamoDB saves each piece of data on three nodes. One of the three nodes is the "leader" (for this piece of data) and writes go to it - and so do consistent reads. But eventually-consistent reads will go to one of the three nodes at random. So the following scenario is possible:

  1. A write is supposed to update three copies of the GSI on three nodes - X, Y and Z - but at this point only X and Y have been updated; Z hasn't yet.
  2. API 1 reads from the GSI, happens to ask node X, and gets the new value.
  3. Now API 2 reads from the GSI. It happens to ask node Z, and gets the old value!
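The scenario above can be played out in a few lines of Python. This is only a toy model of my understanding, not DynamoDB's real implementation: the `ReplicatedIndex` class, the key `"user#1"` and the values are all made up, and the replica is picked explicitly rather than at random so that the outcome is reproducible.

```python
from typing import Dict, Optional


class ReplicatedIndex:
    """Toy model: each key is stored independently on three replicas."""

    def __init__(self) -> None:
        self.replicas: Dict[str, Dict[str, str]] = {"X": {}, "Y": {}, "Z": {}}

    def apply_write(self, replica: str, key: str, value: str) -> None:
        # Replication reaches one replica at a time.
        self.replicas[replica][key] = value

    def read(self, replica: str, key: str) -> Optional[str]:
        # An eventually-consistent read is served by whichever replica it hits.
        return self.replicas[replica].get(key)


index = ReplicatedIndex()

# Step 1: the write has reached X and Y, but not Z yet.
index.apply_write("X", "user#1", "new")
index.apply_write("Y", "user#1", "new")

# Step 2: API 1 happens to hit node X and sees the new value.
first_read = index.read("X", "user#1")

# Step 3: API 2 happens to hit node Z and still sees the old state
# (here: the item is missing).
second_read = index.read("Z", "user#1")

# Only after replication to Z completes do all replicas agree.
index.apply_write("Z", "user#1", "new")
third_read = index.read("Z", "user#1")
```

In this model `first_read` is the new value while the later `second_read` is not, which is precisely a violation of monotonic reads.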

So it will be possible that after your application finds the new value, another read will not find it :-(

If someone else can find better documentation for this issue than just my "what I understood from presentations" I'd love to read their answer too.

Upvotes: 1
