Kartik Kodag
Kartik Kodag

Reputation: 85

How to get recent entries from Dynamo DB

We have a table in Dynamo DB, attributes to be saved are as below.

We are using API gateway and integrating it with Dynamo DB directly. We want to get the latest N records after filtering.

We thought of using Scan with limit of N, but this is not working as "limit" limits the scanning before filtering and gives us inconsistent result. We have created GSI on CreatedDate for the below post request to work.

{
  "TableName": "<TABLE NAME>",
  "Limit": $input.params('pageSize'),
  "FilterExpression": "CreatedDate > :v1",
  "ExpressionAttributeValues": {
      ":v1": {"S": "2024-05-03T06:32:22"}
  },
  "ReturnConsumedCapacity": "TOTAL",

}

We also want to sort the result set in desc order of CreatedDate, latest first.

Do you think it is possible, if not do we need to consider DB change?

Upvotes: 1

Views: 166

Answers (1)

Leeroy Hannigan
Leeroy Hannigan

Reputation: 19883

For this, I would add a single value to your items (assuming your write throughput doesn't exceed 1000WCU per second).

ProcessId CreatedDate Filename GSI1PK Random
123 2024-03-02T20:00:000Z myfile100 1 some values
001 2024-02-01T20:00:000Z myfile202 1 some values
928 2024-03-04T20:00:000Z myfile412 1 some values
102 2024-04-05T20:00:000Z myfile339 1 some values

Now you create an index on GSI1PK as partition key and CreatedDate as sort key:

GSI1PK CreatedDate Filename ProcessId Random
1 2024-02-01T20:00:000Z myfile202 001 some values
1 2024-03-02T20:00:000Z myfile100 123 some values
1 2024-03-04T20:00:000Z myfile412 928 some values
1 2024-04-05T20:00:000Z myfile339 102 some values

Notice how all of your items are now stored in ascending order of CreatedDate. Now you issue a Query and have ScanIndexForward=False which will read the latest items first (DESC).

{
  "TableName": "<TABLE NAME>",
  "IndexName": <INDEX NAME>",
  "Limit": $input.params('pageSize'),
  "KeyConditionExpression": "GSI1PK = :v1",
  "ExpressionAttributeValues": {
      ":v1": {"S": "1"}
  },
  "ReturnConsumedCapacity": "TOTAL",
  "ScanIndexForward": False
}

Of course it doesn't have to be an index, you could use my index schema here as your base table, but that ultimately depends on your specific data access requirements.

More info on this blog post

Upvotes: 0

Related Questions