Reputation: 811
I see the "BatchGetItem" API from AWS, which can be used to retrieve multiple items from DynamoDB. But the capacity consumed is n units for n items, even if each item is only 10 bytes long. Is there a cheaper way of retrieving those items? I see the "Query" API, but it probably supports neither ORing of keys nor the IN operator.
Upvotes: 3
Views: 4152
Reputation: 17516
The cheapest way to get a large number of small items out of DynamoDB is a Query.
BatchGetItem costs the same as the equivalent number of GetItem calls. Only the latency is lower; in fact, if you have a large number of items to get across different partitions, and you're using a language that handles parallelism well, you'll actually get better performance by making a large number of parallel GetItem calls.
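As a minimal sketch of that fan-out pattern: the `get_item` function below is a hypothetical stand-in for a real DynamoDB GetItem call (e.g. via an AWS SDK client); only the threading structure is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def get_item(key):
    # Stand-in for a real GetItem call; replace with your SDK client.
    return {"pk": key, "payload": "..."}

def parallel_get(keys, max_workers=32):
    # One GetItem per key, issued concurrently. Threads work well here
    # because each call is network-bound, not CPU-bound.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(get_item, keys))

items = parallel_get(["k1", "k2", "k3"])
```

Each call is still billed individually; the parallelism only buys back wall-clock time, not capacity.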
One way to reorganize your data would be to store all the small items under the same partition key and use a Query to fetch everything inside that partition. You'll have to keep the per-partition writes-per-second limit in mind, but if you're adding items infrequently and are mostly worried about reads, this should work great.
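To make the shape of that call concrete, here is a sketch of the low-level Query request parameters; the table name "SmallItems" and the partition key attribute name "pk" are assumptions for illustration, and the same fields map onto any SDK's Query method.

```python
def single_partition_query(table_name, partition_value, limit=None):
    # Low-level DynamoDB Query request that returns every item stored
    # under one partition key in a single (paginated) call.
    # "pk" is a hypothetical attribute name -- substitute your schema's
    # partition key.
    request = {
        "TableName": table_name,
        "KeyConditionExpression": "pk = :v",
        "ExpressionAttributeValues": {":v": {"S": partition_value}},
    }
    if limit is not None:
        request["Limit"] = limit
    return request

req = single_partition_query("SmallItems", "batch-2024-01")
```

One Query like this replaces a whole batch of individually billed GetItem calls, as long as the items live under one partition key.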
In a Query, DynamoDB charges based on the size of the data returned, irrespective of the number of items returned. But this seems to be the only place where these economics apply.
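The difference is easy to put in numbers. The sketch below applies DynamoDB's documented strongly consistent read pricing (one read unit per 4 KB, rounded up per item for BatchGetItem but over the total for Query) to the question's scenario of tiny items; eventually consistent reads would halve both figures.

```python
import math

RCU_UNIT = 4096  # a strongly consistent read unit covers up to 4 KB

def batch_get_cost(item_sizes):
    # BatchGetItem rounds *each item* up to 4 KB individually.
    return sum(max(1, math.ceil(s / RCU_UNIT)) for s in item_sizes)

def query_cost(item_sizes):
    # Query rounds the *total* data read up to the next 4 KB.
    return max(1, math.ceil(sum(item_sizes) / RCU_UNIT))

sizes = [10] * 100              # one hundred 10-byte items
batch = batch_get_cost(sizes)   # -> 100 read units
query = query_cost(sizes)       # -> 1 read unit
```

A hundredfold difference for the same bytes, which is exactly why packing small items into one partition pays off.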
Based on what I know of DynamoDB, this makes sense. In a BatchGetItem call, the items can easily come from different partitions (and even different tables), so there is little efficiency to be had: each item must be individually looked up and fetched from its partition over the DynamoDB network. If all the keys in the batch happen to fall in the same partition and be sequential on the range key, that's a lucky coincidence DynamoDB can't exploit.
In a Query, on the other hand, DynamoDB knows up front that it will talk to exactly one partition and do a sequential read inside it, so the call is much simpler and cheaper, with the savings passed on to customers.
Upvotes: 11