user3772393

Reputation: 363

Uniqueness in DynamoDB secondary index

Question:

In a DynamoDB table whose primary key is a composite hash-range key, each hash/range key combination is unique. Does this extend to secondary indices too?

Example:

I have a comments DynamoDB table with a post_id hash key and a comment_id range key. Additionally, there's a local secondary index with a date-user_id range key.

Each entry is a comment a user has left on a post. The purpose of the secondary index is to count how many unique users left a comment on a post on a specific day.

Entry 1: post_id: 1 comment_id: 1 date-user_id: 2014_06_24-1

Entry 2: post_id: 1 comment_id: 2 date-user_id: 2014_06_24-1

Entry 3: post_id: 1 comment_id: 3 date-user_id: 2014_06_24-2

When I query the secondary index with the conditions post_id equals 1 and date-user_id equals 2014_06_24-1, I get a count of 2, but I expect a count of 1.

Why does the secondary index have two entries with the same hash key/range key?

Upvotes: 36

Views: 36755

Answers (5)

Mike Hornblade

Reputation: 1584

Secondary indexes don't guarantee uniqueness. From the docs:

In a DynamoDB table, each key value must be unique. However, the key values in a global secondary index do not need to be unique.

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.scenario

Upvotes: 49

Ihar Kryvanos

Reputation: 21

DynamoDB does not ensure uniqueness for secondary indexes, but it does guarantee uniqueness of the primary key, and we can use that to implement our own unique indexes.

In short, you need to create two records for each comment: one to store the comment itself and a second one that acts as the date-user_id unique index. You also need to add a condition expression every time you insert or update a record.

Here is what our records will look like:

{
  pk: '<post_id>_<comment_id>',
  record_type: 'record',
  'date-user_id': '2022-05-13_<user-id>',
  comment: 'some comment'
}
{
  pk: 'unique-index#2022-05-13_<user-id>',
  record_type: 'unique-index'
}

Each time you insert a new comment into the database, you need to write both records in a single DynamoDB write transaction and check, for each of them, that no other record with the same pk already exists.
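As a rough sketch of that transaction, the request could be assembled like this. It only builds the `TransactItems` payload in the low-level DynamoDB attribute format and does not call AWS. The table name `comments`, the function name, and the sample values are assumptions for illustration, not taken from the article.

```python
def build_unique_comment_transaction(table_name, post_id, comment_id,
                                     date, user_id, comment):
    """Build TransactItems that write a comment plus its uniqueness marker."""
    date_user = f"{date}_{user_id}"
    # The write is rejected if an item with the same pk already exists.
    condition = "attribute_not_exists(pk)"
    return [
        {
            "Put": {
                "TableName": table_name,
                "Item": {
                    "pk": {"S": f"{post_id}_{comment_id}"},
                    "record_type": {"S": "record"},
                    "date-user_id": {"S": date_user},
                    "comment": {"S": comment},
                },
                "ConditionExpression": condition,
            }
        },
        {
            "Put": {
                "TableName": table_name,
                "Item": {
                    "pk": {"S": f"unique-index#{date_user}"},
                    "record_type": {"S": "unique-index"},
                },
                "ConditionExpression": condition,
            }
        },
    ]


# Hypothetical sample values.
items = build_unique_comment_transaction(
    "comments", "1", "2", "2022-05-13", "42", "some comment"
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").transact_write_items(TransactItems=items)
```

If either condition fails, the whole transaction is cancelled, so a comment record is never stored without its marker record.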

Here is my article on how to do this. You can find a detailed description and more extensive code examples there.

Upvotes: 0

aaa90210

Reputation: 12093

It is actually possible to ensure the uniqueness of a GSI by combining transactions and multiple tables.

E.g., let's say your main table has these indexes:

record_id (partition key), name (GSI)

If you want to ensure that "name" is unique in this table, create a secondary table with the following indexes:

name (partition key)

Then, when creating documents in the main table, do it as part of a transaction in which you also create a document in the second table, with a condition ensuring that the name does not already exist. E.g., the transaction would contain the following updates:

PutItem(Table=mainTable, ConditionExpression='attribute_not_exists(#RECORD_ID)', ...)

PutItem(Table=namesTable, ConditionExpression='attribute_not_exists(#NAME)', ...)

Removing an item from the main table can likewise be done in a transaction that deletes the corresponding documents from both tables, which basically ensures referential integrity.
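A minimal sketch of that removal, again only building the request payload rather than calling AWS. The function name and sample values are hypothetical, and the `attribute_exists` condition on the main table is my own assumption (it cancels the transaction when the record is already gone, so the name entry is not freed by a stale delete).

```python
def build_delete_transaction(record_id, name,
                             main_table="mainTable",
                             names_table="namesTable"):
    """Build TransactItems that remove a record and free its unique name."""
    return [
        {
            "Delete": {
                "TableName": main_table,
                "Key": {"record_id": {"S": record_id}},
                # Assumption: only proceed if the record still exists.
                "ConditionExpression": "attribute_exists(#RECORD_ID)",
                "ExpressionAttributeNames": {"#RECORD_ID": "record_id"},
            }
        },
        {
            "Delete": {
                "TableName": names_table,
                "Key": {"name": {"S": name}},
            }
        },
    ]


# Hypothetical sample values.
delete_items = build_delete_transaction("rec-1", "alice")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").transact_write_items(TransactItems=delete_items)
```

Note that the caller has to know the current name of the record, for example by reading the item from the main table first.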

Upvotes: 2

Sanjay Verma

Reputation: 1616

No, they don't. Global secondary indexes are updated asynchronously, meaning they are eventually consistent, which also means that DynamoDB cannot enforce uniqueness at the time you make the update call: it does not check for uniqueness on the secondary index because that is an async operation, and even if it did, it would have no way to return a failure, since the real-time call would already have finished. (Local secondary indexes support strongly consistent reads, but they do not enforce uniqueness either.)

On a side note, this is related to why you can only perform Scan or Query on a secondary index, but not GetItem: GetItem is expected to return exactly one item, but in the absence of a uniqueness constraint there can be many items for a given index key.
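For illustration, a Query against the index from the question could be parameterized roughly as below. The index name `date-user_id-index` and the numeric type of `post_id` are assumptions, since the question does not state them. The `#du` placeholder is needed because `date-user_id` contains hyphens and cannot appear directly in an expression.

```python
def build_index_count_query(post_id, date_user_id,
                            table_name="comments",
                            index_name="date-user_id-index"):
    """Build keyword arguments for a Query that counts matching index entries."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": "post_id = :post AND #du = :du",
        "ExpressionAttributeNames": {"#du": "date-user_id"},
        "ExpressionAttributeValues": {
            ":post": {"N": str(post_id)},  # assumes post_id is a Number
            ":du": {"S": date_user_id},
        },
        "Select": "COUNT",  # return only the number of matches
    }


query_kwargs = build_index_count_query(1, "2014_06_24-1")
# With boto3 installed and credentials configured:
#   response = boto3.client("dynamodb").query(**query_kwargs)
#   response["Count"] is the number of index entries that matched
```

Because several items can share the same index key, the count reflects the number of comments, not the number of distinct users.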

Upvotes: 32

Vivek Halder

Reputation: 93

Each item in a Local Secondary Index (LSI) has a 1:1 relationship with the corresponding item in the table. In the example above, while entry 1 and entry 2 in the LSI have the same range key value, the items in the table they point to are different. Hence index keys (hash or hash+range) are not unique.

Global Secondary Indexes (GSIs) are similar to LSIs in this respect. Every GSI item contains the table hash and range keys of the corresponding item. More details are available at http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/GSI.html#GSI.Projections
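To make the 1:1 relationship concrete, here is a small in-memory model of the three entries from the question. It is plain Python rather than DynamoDB, and the helper names are made up for illustration. It shows why the index lookup yields 2, and how the distinct users for a day could be derived from the matching entries on the client side.

```python
# The three table items from the question; the index holds one entry per item.
comments = [
    {"post_id": 1, "comment_id": 1, "date-user_id": "2014_06_24-1"},
    {"post_id": 1, "comment_id": 2, "date-user_id": "2014_06_24-1"},
    {"post_id": 1, "comment_id": 3, "date-user_id": "2014_06_24-2"},
]


def query_index(items, post_id, date_user_id):
    """Return every entry whose index key matches; duplicates are allowed."""
    return [
        item for item in items
        if item["post_id"] == post_id and item["date-user_id"] == date_user_id
    ]


def unique_users_on(items, post_id, date):
    """Collect the distinct user ids that commented on a post on a given day."""
    prefix = f"{date}-"
    return {
        item["date-user_id"][len(prefix):]
        for item in items
        if item["post_id"] == post_id and item["date-user_id"].startswith(prefix)
    }


matches = query_index(comments, 1, "2014_06_24-1")
print(len(matches))                                         # 2
print(sorted(unique_users_on(comments, 1, "2014_06_24")))   # ['1', '2']
```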

Upvotes: 0
