Reputation: 849
I'm trying to implement pagination in DynamoDB with the Java SDK.
I have a simple data model with a HashKey id and a date as RangeKey. I want to query for all dates after a given one. This works so far, but the problem is the pagination part using the last evaluated key.
When querying the last page, the lastEvaluatedKey is not null; it still points to the last item of the last page queried. Another query with this key set as exclusiveStartKey then returns 0 results with a null lastEvaluatedKey.
My code looks like the following:
    var query = new DynamoDBQueryExpression<DynamoModel>();
    var keyCondition = ImmutableMap.<String, AttributeValue>builder()
            .put(":v_userid", new AttributeValue().withS(userId))
            .put(":v_date", new AttributeValue().withS(date.toString()))
            .build();
    if (!StringUtils.isEmpty(lastKey)) {
        query.setExclusiveStartKey(ImmutableMap.<String, AttributeValue>builder()
                .put("userId", new AttributeValue().withS(userId))
                .put("date", new AttributeValue().withS(lastKey))
                .build());
    }
    query.withKeyConditionExpression("userId = :v_userid AND date >= :v_date");
    query.withExpressionAttributeValues(keyCondition);
    query.setLimit(2);
    QueryResultPage<DynamoModel> resultPage = mapper.queryPage(DynamoModel.class, query);
Does anybody know why the lastEvaluatedKey is not null when reaching the last item matching the KeyCondition? When I only save items that match the condition, the lastEvaluatedKey is null as expected.
Upvotes: 6
Views: 15219
Reputation: 7669
This is the expected behavior of DynamoDB.
"If LastEvaluatedKey is not empty, it does not necessarily mean that there is more data in the result set. The only way to know when you have reached the end of the result set is when LastEvaluatedKey is empty." (source)
This is a design decision by AWS. The most likely explanation I can think of is that, in order to return a LastEvaluatedKey if and only if there are more items, they would have to keep scanning to find more items, and if you're using a filter expression, they might have to scan the rest of the partition to determine whether or not there are more items. It's a choice that helps to minimize the latency of the query (and scan) operation.
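In practice this means the paging loop should keep going until the key comes back null, and should tolerate an empty final page. Here is a minimal sketch of that loop in plain Java. It does not use the AWS SDK: PaginationSketch, Page, queryPage and queryAll are hypothetical names, and queryPage is only an in-memory stand-in that assumes the "stop at the limit without looking ahead" behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for one DynamoDB partition; this is NOT the AWS SDK.
class PaginationSketch {

    /** One page of results plus the key to resume from (null once the query is finished). */
    static final class Page {
        final List<String> items;
        final String lastEvaluatedKey;

        Page(List<String> items, String lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    /**
     * Assumed behavior: evaluation stops as soon as `limit` items have been read,
     * without looking ahead, so a key is returned even when the item just read
     * was the last match.
     */
    static Page queryPage(List<String> sortedDates, String fromDate,
                          String exclusiveStartKey, int limit) {
        List<String> items = new ArrayList<>();
        for (String date : sortedDates) {
            if (date.compareTo(fromDate) < 0) {
                continue; // key condition: date >= fromDate
            }
            if (exclusiveStartKey != null && date.compareTo(exclusiveStartKey) <= 0) {
                continue; // resume strictly after the exclusive start key
            }
            items.add(date);
            if (items.size() == limit) {
                return new Page(items, date); // limit hit: report a key, do not check for more
            }
        }
        return new Page(items, null); // reached the end of the partition: no key
    }

    /** Collects every page, stopping only when the returned key is null. */
    static List<String> queryAll(List<String> sortedDates, String fromDate, int limit) {
        List<String> all = new ArrayList<>();
        String startKey = null;
        do {
            Page page = queryPage(sortedDates, fromDate, startKey, limit);
            all.addAll(page.items); // the final page may legitimately be empty
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
        return all;
    }

    public static void main(String[] args) {
        List<String> dates = List.of(
                "2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04", "2020-01-05");
        String startKey = null;
        do {
            Page page = queryPage(dates, "2020-01-02", startKey, 2);
            System.out.println(page.items + " lastEvaluatedKey=" + page.lastEvaluatedKey);
            startKey = page.lastEvaluatedKey;
        } while (startKey != null);
    }
}
```

With four matching items and a limit of 2, the second page in this sketch still carries a key, and only a third, empty page returns null, which is the same pattern as in the question. If you expose pages to a client, one option is to treat "empty page with a null key" as the end marker rather than relying on the key of the last non-empty page.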
Upvotes: 10