Reputation: 1198
I have to get the items from a table similar to what I have provided below where the Result and Accuracy don't exist or Null. The table is having a lot of data and it is growing daily. I tried to get the items from the dynamo DB using the scan. But it is taking a lot of time to show the result. Can anyone suggest me how to run the query or scan which will provide me the result faster?
Dynamo DB Table(Sample):
|ID(primary)|date(String)|Result(number)|Accuracy(Number)|
| edfca | 20180101 | 12 | 59.99 |
| yrdfa | 20180109 | 72 | 91.49 |
| arfead | 20180101 | 122 | 32.12 |
| readssda | 20180112 | | |
| geeddar | 20180123 | 3 | 81.11 |
| rogeasa | 20180203 | | |
| swiwads | 20180205 | 32 | 13.12 |
Currently, I'm using the following
java:
ScanSpec scanSpec = new ScanSpec();
scanSpec.withFilterExpression("attribute_not_exists(accuracy)");
scanSpec.setMaxResultSize(100);
table.scan(scanSpec);
The above scan operation is very slow.
I'm new to dynamo DB and I googled it and found it's better to use a query than using the scan operation.
Can I use the query operation for getting the data I need? If yes please provide an example
Upvotes: 1
Views: 11862
Reputation: 3029
When you do a query you have to specify the Partition Key. Given your table, assuming you have not defined any secondary index, you can only query for one partition key. Useless.
Should you really want to improve the performance I can imagine a solution. When you write items in your table, define a column hasNull: string
(for example) and set it to 'y'
for all rows with fills the condition. Let it undefined for all other items.
Later, define a Global Secondary Index named MY_GSI_NAME
:
hasNull
ID
KEYS_ONLY
(or ALL
, see dynamodb docs)In node.js you may query your table with:
const params = {
TableName: 'MY_TABLE_NAME',
IndexName: 'MY_GSI_NAME',
KeyConditionExpression: '#pk = :pk',
ExpressionAttributeNames: {
'#pk': 'hasNull',
},
ExpressionAttributeValues: {
':pk': 'y',
},
};
return MY_DOCUMENT_CLIENT.query(params).promise();
Doing that you'll get all the items with hasNull === 'y'
.
That query has the advantage that only the items you are interested in are scanned since the index is sparse (hasNull
is missing for the items you do not need).
Hope this give you a starting point to solve your problem. Ask more to know more :)
Bonne chance!
Upvotes: 2