Nadav Har'El

Reputation: 13731

Concurrent updates in DynamoDB, are there any guarantees?

In general, if I want to be sure what happens when several threads make concurrent updates to the same item in DynamoDB, I should use conditional updates (i.e., "optimistic locking"). I know that. But I was wondering if there is any other case where I can be sure that concurrent updates to the same item survive.

For example, in Cassandra, making concurrent updates to different attributes of the same item is fine, and both updates will eventually be available to read. Is the same true in DynamoDB? Or is it possible that only one of these updates survives?

A very similar question is what happens if I add, concurrently, two different values to a set or list in the same item. Am I guaranteed that I'll eventually see both values when I read this set or list, or is it possible that one of the additions will mask out the other during some sort of DynamoDB "conflict resolution" protocol?

I see a version of my second question was already asked here in the past, Are DynamoDB "set" values CDRTs?, but the answer referred to a not-very-clear FAQ entry which no longer exists. What I would most like to see as an answer to my question is official DynamoDB documentation that says how DynamoDB handles concurrent updates when neither "conditional updates" nor "transactions" are involved, and in particular what happens in the above two examples. Absent such official documentation, does anyone have any real-world experience with such concurrent updates?

Upvotes: 18

Views: 24878

Answers (2)

Ondra Žižka

Reputation: 46796

Your post contains quite a lot of questions.

There's a note in DynamoDB's manual:

All write requests are applied in the order in which they were received.

I assume the client sends the requests in the order in which the calls were made.

That should answer the question of whether there are any guarantees. If you update different properties of an item in several requests, each updating only its own properties, the item should end up in the expected state (the 'sum' of the distinct changes).
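
As an illustration, here is a minimal sketch of two such requests fired in parallel with the Node.js DocumentClient; the table name 'concurrency-test', the key attribute 'x', and the two attribute names are assumptions made up for the example:

const aws = require('aws-sdk');
const ddb = new aws.DynamoDB.DocumentClient();

// Two parallel updates to the same item, each SETting a different
// attribute. Since each request touches only its own attribute, both
// changes should survive regardless of the order they are applied in.
async function updateDistinctAttributes() {
    await Promise.all([
        ddb.update({
            TableName: 'concurrency-test',
            Key: {x: 'item-1'},
            UpdateExpression: 'SET #a = :v',
            ExpressionAttributeNames: {'#a': 'color'},
            ExpressionAttributeValues: {':v': 'red'}
        }).promise(),
        ddb.update({
            TableName: 'concurrency-test',
            Key: {x: 'item-1'},
            UpdateExpression: 'SET #b = :v',
            ExpressionAttributeNames: {'#b': 'size'},
            ExpressionAttributeValues: {':v': 'XL'}
        }).promise()
    ]);
}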

If you, on the other hand, update the whole object, the last one will win.

DynamoDB has @DynamoDbVersion, which you can use for optimistic locking to manage concurrent writes of whole objects.
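
@DynamoDbVersion belongs to the Java enhanced client, but under the hood version-based optimistic locking is just a conditional write. A minimal sketch of the same idea with the DocumentClient, assuming the item was read earlier and carries a numeric version attribute (all names made up):

const aws = require('aws-sdk');
const ddb = new aws.DynamoDB.DocumentClient();

// Write the whole object back only if nobody bumped the version since
// we read it; otherwise DynamoDB rejects the write.
async function saveWithVersion(item) {
    await ddb.put({
        TableName: 'concurrency-test',
        Item: {...item, version: item.version + 1},
        ConditionExpression: '#v = :expected',
        ExpressionAttributeNames: {'#v': 'version'},
        ExpressionAttributeValues: {':expected': item.version}
    }).promise();
    // A concurrent writer that got in first raises
    // ConditionalCheckFailedException; re-read the item and retry.
}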

For scenarios like auctions or parallel tick counts (such as "likes"), DynamoDB offers atomic counters.
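
An atomic counter is just an UpdateItem with an ADD action, applied server-side, so parallel increments accumulate instead of overwriting each other. A minimal sketch (again with made-up names):

const aws = require('aws-sdk');
const ddb = new aws.DynamoDB.DocumentClient();

// ADD increments the numeric attribute on the server, avoiding the
// read-modify-write race entirely.
async function addLike(postId) {
    const response = await ddb.update({
        TableName: 'concurrency-test',
        Key: {x: postId},
        UpdateExpression: 'ADD #likes :one',
        ExpressionAttributeNames: {'#likes': 'likes'},
        ExpressionAttributeValues: {':one': 1},
        ReturnValues: 'UPDATED_NEW'
    }).promise();
    return response.Attributes.likes; // value after this increment
}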

If you update a list, it depends on whether you use DynamoDB's native list type (L), or whether the list is just a property that the client serializes into a String (S). If you read a property, change it, and write it back, and do that in parallel, the result will be subject to eventual consistency - what you read may not be the latest write. Applied to lists, and repeated several times, you'll end up with some of the elements added and some not (or, better said, added but then overwritten). Server-side updates avoid this read-modify-write race, as sketched below.
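
Here is a minimal sketch of such server-side updates with the DocumentClient: list_append for a native list (L), and ADD for a set. The table, key, and attribute names are made up, and as with anything concurrency-related, it is worth verifying against your own workload:

const aws = require('aws-sdk');
const ddb = new aws.DynamoDB.DocumentClient();

// Append to a native list (L) on the server instead of reading,
// modifying, and writing the list back from the client.
async function appendToList(id, value) {
    await ddb.update({
        TableName: 'concurrency-test',
        Key: {x: id},
        // if_not_exists() seeds an empty list on the first append
        UpdateExpression: 'SET #l = list_append(if_not_exists(#l, :empty), :vals)',
        ExpressionAttributeNames: {'#l': 'myList'},
        ExpressionAttributeValues: {':vals': [value], ':empty': []}
    }).promise();
}

// For a set, ADD inserts the value server-side; duplicates are ignored.
async function addToSet(id, value) {
    await ddb.update({
        TableName: 'concurrency-test',
        Key: {x: id},
        UpdateExpression: 'ADD #s :v',
        ExpressionAttributeNames: {'#s': 'mySet'},
        ExpressionAttributeValues: {':v': ddb.createSet([value])}
    }).promise();
}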

Upvotes: 7

Carl

Reputation: 689

I just had the same question and came across this thread. Given that there was no answer, I decided to test it myself.

The answer, as far as I can observe, is that as long as you are updating different attributes, the updates will eventually all succeed. It does take a little longer the more updates I push to the item, so they appear to be written in sequence rather than in parallel.

I also tried updating a single List attribute in parallel, and this expectedly failed: the resulting list, once all queries had completed, was broken and contained only some of the entries pushed to it.

The test I ran was pretty rudimentary, and I might be missing something, but I believe the conclusion to be correct.

For completeness, here is the Node.js script I used.

const aws = require('aws-sdk');
const ddb = new aws.DynamoDB.DocumentClient();

// Usage: node index.js {key} {number-of-parallel-updates}
const key = process.argv[2];
const num = parseInt(process.argv[3], 10);

run().then(() => {
    console.log('Done');
});

async function run() {
    // Fire `num` updates at the same item in parallel, each SETting a
    // different attribute (k0, k1, ...).
    const p = [];
    for (let i = 0; i < num; i++) {
        p.push(ddb.update({
            TableName: 'concurrency-test',
            Key: {x: key},
            UpdateExpression: 'SET #k = :v',
            ExpressionAttributeValues: {
                ':v': `test-${i}`
            },
            ExpressionAttributeNames: {
                '#k': `k${i}`
            }
        }).promise());
    }

    await Promise.all(p);

    // Read the item back and count its attributes; if every update
    // survived, this prints num + 1 (the key attribute plus k0..k{num-1}).
    const response = await ddb.get({TableName: 'concurrency-test', Key: {x: key}}).promise();
    const item = response.Item;

    console.log('keys', Object.keys(item).length);
}

Run like so:

node index.js {key} {number}
node index.js myKey 10

Timings:

  • 10 updates: ~1.5s
  • 100 updates: ~2s
  • 1000 updates: ~10-20s (fluctuated a lot)

Worth noting: the metrics show a lot of throttled events, but these are handled internally by the Node.js SDK using exponential backoff, so once the dust settled, everything was written as expected.

Upvotes: 14
