Benjamin Hobson

Reputation: 104

DynamoDB unique primary key ConditionExpression

I am implementing a single-table design in DynamoDB and overloading keys. My current design allows an email to be subscribed to a thread.

[Screenshot: NoSQL Workbench view of the table]

I am using EM#<email> as the partition key and SB#<thread id> as the sort key. I am constructing a PutItemCommand from a Node.js Lambda environment, and the command works as expected.

Here is the payload:

new PutItemCommand({
    TableName: 'sometable',
    Item: {
        pk: {
            S: asEmail(email), // Resolves to `EM#${email}`
        },
        sk: {
            S: asSubscription(chain), // Resolves to `SB#${chain}`
        },
    },
    // Evaluated against the existing item (if any) with this exact (pk, sk)
    ConditionExpression: 'attribute_not_exists(pk)',
}),

Now I am just confused why this works. I am trying to ensure that the primary key (pk,sk) is unique so an email cannot be subscribed twice to a thread. But I am confused why

ConditionExpression: 'attribute_not_exists(pk)',

correctly accomplishes this. Reading this condition expression, I would expect it to check only that no item with a matching partition key exists. Is 'pk' an alias, or does this have something to do with how DynamoDB retrieves data? I just need someone to spell this out for me.

Upvotes: 3

Views: 2353

Answers (2)

yshavit

Reputation: 43436

From AWS's documentation on condition expressions:

The following example uses attribute_not_exists() to check whether the primary key exists in the table before attempting the write operation.

Note

If your primary key consists of both a partition key(pk) and a sort key(sk), the parameter will check whether attribute_not_exists(pk) AND attribute_not_exists(sk) evaluate to true or false before attempting the write operation.

...

--condition-expression "attribute_not_exists(Id)"

Note that if your table has a partition key and a sort key, then both are required for each item, and together they uniquely identify each item. That means that you can't have a duplicate (pk,sk) by definition. If you try to put a new item with the same (pk,sk) as an existing one (without the condition expression), you'll just overwrite the old one.

This counter-intuitive behavior of attribute_not_exists(pk) comes from the fact that the (pk,sk) is the item's identifier. Imagine you tried to add an item at (EM#test2@testing.com, SUB#900) (which doesn't exist). DynamoDB will look up that item and ask, "does it have a pk attribute?" It doesn't (since it doesn't exist), so the put will succeed. If you try to put it a second time, the pk attribute will exist, and so the second put will fail.

Another way of looking at this is that since each item must have a pk and an sk, attribute_not_exists(pk) == attribute_not_exists(sk) (either both exist or neither does), and so checking for one is equivalent to checking for both.
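To make the lookup-then-check behavior concrete, here is a minimal in-memory sketch. It is a toy model, not the AWS SDK: `ToyTable`, `attributeNotExists`, and `tryPut` are hypothetical names invented for illustration, and the model only assumes what is described above, namely that the condition is evaluated against the existing item with the same full (pk,sk), or against an empty item if there is none.

```javascript
// Toy in-memory model of a table with a composite primary key (pk, sk).
// NOT the AWS SDK: every name in this sketch is hypothetical.
class ToyTable {
  constructor() {
    this.items = new Map(); // full primary key -> item
  }

  // Items are addressed by the whole (pk, sk) pair, never by pk alone.
  static keyOf(item) {
    return JSON.stringify([item.pk, item.sk]);
  }

  // `condition` receives the existing item stored under the SAME (pk, sk),
  // or an empty object if no such item exists.
  put(item, condition = () => true) {
    const key = ToyTable.keyOf(item);
    const existing = this.items.get(key) ?? {};
    if (!condition(existing)) {
      const err = new Error('The conditional request failed');
      err.name = 'ConditionalCheckFailedException';
      throw err;
    }
    this.items.set(key, { ...item });
  }
}

// Models ConditionExpression: 'attribute_not_exists(<name>)'.
const attributeNotExists = (name) => (existing) => !(name in existing);

// Returns 'ok' if the put went through, 'rejected' if the condition failed.
function tryPut(table, item, condition) {
  try {
    table.put(item, condition);
    return 'ok';
  } catch (err) {
    if (err.name === 'ConditionalCheckFailedException') return 'rejected';
    throw err;
  }
}

const table = new ToyTable();
const results = [
  // New (pk, sk): nothing stored under that key, so pk "does not exist".
  tryPut(table, { pk: 'EM#a@example.com', sk: 'SB#1' }, attributeNotExists('pk')),
  // Same (pk, sk) again: the existing item has a pk attribute.
  tryPut(table, { pk: 'EM#a@example.com', sk: 'SB#1' }, attributeNotExists('pk')),
  // Same pk, different sk: a different item, so the check passes.
  tryPut(table, { pk: 'EM#a@example.com', sk: 'SB#2' }, attributeNotExists('pk')),
  // Checking sk instead of pk gives the same answer for an existing item.
  tryPut(table, { pk: 'EM#a@example.com', sk: 'SB#2' }, attributeNotExists('sk')),
];
console.log(results);
```

The third put is the case the question worries about: the same email subscribing to a second thread is not blocked, because the condition never looks at other items that share the partition key.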

Upvotes: 2

Leeroy Hannigan

Reputation: 19893

pk is not an alias; it is the name of your partition key attribute.

A PutItem targets a single item, identified by its full (pk, sk) key, and the condition is checked against that item only. So you can set your condition on either the PK or the SK and it will work as expected.

Upvotes: 0
