Reputation: 984
I have a DyanmoDB table that for the sake of this question looks like this:
I want to query the table for a subset of origins under a specific id. From my understanding, the only operator DynamoDB allows on sort keys in a Query are 'between', 'begins_with', '=', '<=' and '>='.
The problem is that my query needs a form of 'CONTAINS' because the 'origins' list is not necessarily ordered (for a between operator).
If this was SQL it would be something like:
SELECT * from Table where id={id} AND origin IN {origin_list}
My exact question is: What do I need to do to achieve this functionality in the most efficient way? should I change my table structure? maybe add a GSI? Open to suggestions.
I am aware that this can be achieved with a Scan operation but I want to have an efficient query. Same goes for BatchGetItem, I would rather avoid that functionality unless absolutely necessary.
Thanks
Upvotes: 11
Views: 19263
Reputation: 1008
Sort keys are designed for "sorting", and just run one query for each primary key and sort key. Each individual read would be cheap and fast, just depends how long list is.
Pseudo_code:
for origin in origin_list:
get_from_dynamo(primary_key=id, sort_key=origin)
Upvotes: -1
Reputation: 1166
This is a case for using Filter Expressions for Query. It has IN
operator
a IN (b, c, d) — true if a is equal to any value in the list — for example, any of b, c or d. The list can contain up to 100 values, separated by commas.
However, you cannot use condition expressions on key attributes.
A filter expression cannot contain partition key or sort key attributes. You need to specify those attributes in the key condition expression, not the filter expression.
So, what you could do is to use origin not as a sort key (or duplicate it with another attribute) to filter it after the query. Of course filter first reads all the items has that 'id' and filters later which consumes read capacity and less efficient but there is no other way to query that otherwise. Depending on your item sizes and query frequency and estimated number of returned items BatchGetItem could be a better choice.
Upvotes: 10