Reputation: 463
I need make a scan with limit and a condition on DynamoDB.
The docs says:
In a response, DynamoDB returns all the matching results within the scope of the Limit value. For example, if you issue a Query or a Scan request with a Limit value of 6 and without a filter expression, DynamoDB returns the first six items in the table that match the specified key conditions in the request (or just the first six items in the case of a Scan with no filter). If you also supply a FilterExpression value, DynamoDB will return the items in the first six that also match the filter requirements (the number of results returned will be less than or equal to 6).
The code (NODEJS):
var params = {
ExpressionAttributeNames: {"#user": "User"},
ExpressionAttributeValues: {":user": parseInt(user.id)},
FilterExpression: "#user = :user and attribute_not_exists(Removed)",
Limit: 2,
TableName: "XXXX"
};
DynamoDB.scan(params, function(err, data) {
if (err) {
dataToSend.message = "Unable to query. Error: " + err.message;
} else if (data.Items.length == 0) {
dataToSend.message = "No results were found.";
} else {
dataToSend.data = data.Items;
console.log(dataToSend);
}
});
Table XXXX definitions:
In code above, if I remove the Limit
parameter, DynamoDB will return the items that match the filter requirements. So, the conditions are ok. But when I scan with Limit
parameter, the result is empty.
The XXXX table, has 5 items. Only the 2 firsts have the Removed
attribute. When I scan without Limit
parameter, DynamoDB returns the 3 items without Removed
attribute.
What i'm doing wrong?
Upvotes: 20
Views: 42090
Reputation: 598
Small hack - Iterate till you get the results
lastEvaluatedKey = null;
do {
if(lastEvaluatedKey != null) {
// query or scan data with last evaluated key
} else {
// query or scan data WITHOUT last evaluated key
}
lastEvaluatedKey == key of last item retrieved
} while(lastEvaluatedKey != null && retrievedResultSize == 0); // == 0 or < yourLimit
If the number of items retrieved is 0 and lastEvaluatedKey is not null that means it has scanned or queried the number of rows which match to your limit. (and result size is zero because they didn't match the filter expression)
Upvotes: -1
Reputation: 11
You might be able to get what you need by using a secondary index. Using the classic RDB example, customer - order example: you have one table for customers and one for orders. The Orders table has a Key consisting of Customer - HASH, Order - RANGE. So if you wanted to get the latest 10 orders, there would be no way to do it without a scan
But if you create a Global Secondary Index on orders of "Some Constant" -- HASH, Date RANGE, and queried against that index, they query would do what you want and only charge you for the RCUs involved with the records returned. No expensive scan needed. Note, writes will be more expensive, but in most cases, there are many more reads than writes.
Now you have your original problem if you want to get the 10 biggest orders for a day larger than $1000. The query would return the last 10 orders, and then filter out those less than $1000.
In this case, you could create a computed key of Date-OrderAmount, and queries against that index would return what you want.
It's not as simple as SQL, but you need to think about access patterns in SQL too. If if you have a lot of data, you need to create Indexes in SQL or the DB will happily to table scans on your behalf, which will impair performance and raise your costs.
Note that everything I proposed is normalized in the sense that there is only one source of truth. You are not duplicating data -- you are merely recasting views of it to get what you need from DynamoDB.
Bear in mind that the CONSTANT as a HASH s subject to the 10GB per partition limit, so you would need to design around it if you had a lot of active data. For example, depending on your expected access pattern, you could use Customer and not a constant as a HASH. Or use STreams to organize the data (or subsets) in other ways.
Upvotes: 1
Reputation: 51
Also ran into this issue, i guess you will just have to scan the whole table to a max of 1 MB
Scan
The result set from a Scan is limited to 1 MB per call. You can use the LastEvaluatedKey
from the scan response to retrieve more results.
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Limits.html
Upvotes: 5
Reputation: 201138
From the docs that you quoted:
If you also supply a FilterExpression value, DynamoDB will return the items in the first six that also match the filter requirements
By combining Limit and FilterExpression you have told DynamoDB to only look at the first two items in the table, and evaluate the FilterExpression against those items. Limit in DynamoDB can be confusing because it works differently from limit
in a SQL expression in a RDBMS.
Upvotes: 28