Cory Mawhorter

Reputation: 1821

Weird result when querying a dynamodb GSI with a non-existent start key. Bug or feature?

I'm querying a global secondary index with a start key that does not exist and I'm seeing some weird results. Is this a ddb bug or (un?)documented behavior? Are there workarounds?

I have a table whose primary hash key is "id", and a GSI named ShopIndex whose hash key is "shop". Neither has a range key.

When I query using a start key "id" that does not exist, I'd expect to get back an empty response with a correct last evaluated key since there are no results to return after an invalid start key.

However, what I'm seeing is that seemingly random results are returned.

Code example:

This snippet returns one item from the index. Not the first result. Not the last.

const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();
dynamodb.query({
    TableName: 'products',
    IndexName: 'ShopIndex',
    Limit: 1,
    ExpressionAttributeValues: {
        ':shop': { S: 'shop_KgHqp62taEV' }
    },
    KeyConditionExpression: 'shop = :shop',
    ExclusiveStartKey: {
        id: { S: 'doesnotexist' },
        shop: { S: 'shop_KgHqp62taEV' },
    },
}, (err, result) => {
    if (err) throw err;
    console.log('result', JSON.stringify(result, null, 2));
});

If I remove the start key entirely, it returns a different item.

If I add it back and set ScanIndexForward: false, it returns a third different item.

If I remove the start key AND set ScanIndexForward: false, it returns a fourth different item.

Wtf.

As far as I can tell, there is no way to detect this other than looking up the "id" and confirming it exists before attempting to use it as the start key?
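For illustration, the lookup described above might be sketched roughly like this. `startKeyExists` is a hypothetical helper name, and `client` is assumed to be an SDK v2 `AWS.DynamoDB` instance such as `dynamodb` from the snippet; it costs an extra read per query and only confirms the item existed at the time of the check.

```javascript
// Hypothetical helper (not part of the AWS SDK): resolves to true when an
// item with the given "id" exists in the table, false otherwise.
// `client` is assumed to be an AWS.DynamoDB (SDK v2) instance.
function startKeyExists(client, tableName, id) {
  return new Promise((resolve, reject) => {
    client.getItem(
      { TableName: tableName, Key: { id: { S: id } } },
      (err, data) => {
        if (err) return reject(err);
        // getItem returns no Item attribute when the key is not found.
        resolve(Boolean(data && data.Item));
      }
    );
  });
}
```

It would be called as `startKeyExists(dynamodb, 'products', 'doesnotexist')` before building the ExclusiveStartKey, skipping the query (or dropping the start key) when it resolves to false.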

Did I miss this in the docs, or is this yet another batteries not included aws landmine that I need to work around?

Upvotes: 1

Views: 1976

Answers (1)

Costin

Reputation: 3029

It is a feature!

In your table you have multiple ids for the same shop.

Just imagine the following example:

id    shop
41    A
22    A
93    A
34    A

Suppose the items are stored in that order: 41, 22, 93, 34.

When you ask for one item without any ExclusiveStartKey, you will get 41 (the first scanned).

When you say that the starting key (i.e. last evaluated key) is 93, you will get the next one: 34.

When you say that the starting key is 93, but ScanIndexForward: false, you will look backward and you'll get 22.

In order to better understand, run the queries without Limit: 1. You should notice the difference in the results.
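The example above can be played through with a small toy model. This is only a sketch of the explanation, not how DynamoDB actually stores or orders index entries: `queryModel` is a hypothetical helper, the order 41, 22, 93, 34 is simply assumed, and the model only covers start keys that exist in the list.

```javascript
// Toy model of the example table; the stored order is an assumption.
const items = [41, 22, 93, 34].map((id) => ({ id, shop: 'A' }));

// Hypothetical helper: return up to `limit` ids that come after `startId`,
// walking the stored order forward or backward.
function queryModel(rows, { startId, forward = true, limit = 1 } = {}) {
  const ordered = forward ? rows : [...rows].reverse();
  let start = 0;
  if (startId !== undefined) {
    const position = ordered.findIndex((row) => row.id === startId);
    if (position === -1) {
      // A missing start key is outside what this toy model describes.
      throw new Error('start key not found in the model');
    }
    start = position + 1; // exclusive: begin just after the start key
  }
  return ordered.slice(start, start + limit).map((row) => row.id);
}

console.log(queryModel(items));                                  // [ 41 ]
console.log(queryModel(items, { startId: 93 }));                 // [ 34 ]
console.log(queryModel(items, { startId: 93, forward: false })); // [ 22 ]
```

Passing a larger `limit` shows the rest of the page in each direction, which mirrors the suggestion to rerun the real queries without Limit: 1.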

So, it is definitely a feature! A very important one, because with this behavior and a range key you can do wonderful queries. I did! ;)

Upvotes: 1
