liveleker

Reputation: 11

Hundreds of parallel dynamodb queries

I'm trying to find the best practice for doing hundreds of parallel DynamoDB queries for a single request. I'm currently using Python, but I'm open to any language or framework that works best for this use case. Here is basically what I want to do; I've shortened it to only 4 values, but in the end I would like to query 500 at once.

import boto3
from boto3.dynamodb.conditions import Key

# 4 keys shown here; the real dict would hold 500.
variables = {'random1': None, 'random2': None, 'random3': None, 'random500': None}

table = boto3.resource('dynamodb', region_name='eu-west-1').Table('sometable')
for v in variables:
    # One COUNT query per partition key, run sequentially.
    variables[v] = table.query(KeyConditionExpression=Key('k').eq(v), Select='COUNT')['Count']

print(variables)
# expected output: {'random1': 12, 'random2': 30, 'random3': 230, 'random500': 5}

So I'm doing Select='COUNT' queries to get the number of items stored under each key in the table, and the resulting dict is something I need to return from the service. Each individual query responds quickly, around 40 ms, but running them sequentially obviously scales linearly, which doesn't work: I'd want all 500 lookups to complete in under 150 ms total.
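For context, this is roughly the thread-pool version I've been experimenting with (the key list and worker count are placeholders):

import threading
from concurrent.futures import ThreadPoolExecutor

import boto3
from boto3.dynamodb.conditions import Key

keys = [f'random{i}' for i in range(1, 501)]  # stand-in for the real 500 keys

local = threading.local()

def count_for(key):
    # boto3 sessions/resources aren't thread-safe, so each worker builds its own.
    if not hasattr(local, 'table'):
        session = boto3.session.Session()
        local.table = session.resource('dynamodb', region_name='eu-west-1').Table('sometable')
    resp = local.table.query(KeyConditionExpression=Key('k').eq(key), Select='COUNT')
    return key, resp['Count']

# Fan the 500 COUNT queries out across 50 threads and collect into a dict.
with ThreadPoolExecutor(max_workers=50) as pool:
    variables = dict(pool.map(count_for, keys))

This should cut the wall-clock time compared to the loop, but I'm not sure a thread pool is the right approach at 500 keys per request.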

Has anyone done anything similar? Any advice would be greatly appreciated!

Upvotes: 1

Views: 1996

Answers (1)

Charles

Reputation: 23823

My advice would be to not do this.

If you need aggregations in DDB, the preferred approach would be to enable streams and have a Lambda update/write an aggregation entry in the existing table (or a new one).
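A minimal sketch of that stream handler, assuming a separate counts table with partition key k and a numeric cnt attribute (the names are made up):

import boto3

counts = boto3.resource('dynamodb').Table('counts')  # hypothetical aggregate table

def handler(event, context):
    # Triggered by the source table's DynamoDB stream.
    for record in event['Records']:
        key = record['dynamodb']['Keys']['k']['S']
        if record['eventName'] == 'INSERT':
            # Atomically bump the pre-aggregated count for this key.
            counts.update_item(
                Key={'k': key},
                UpdateExpression='ADD cnt :delta',
                ExpressionAttributeValues={':delta': 1},
            )
        elif record['eventName'] == 'REMOVE':
            # Keep the count in sync with deletes too.
            counts.update_item(
                Key={'k': key},
                UpdateExpression='ADD cnt :delta',
                ExpressionAttributeValues={':delta': -1},
            )

Your service then fetches the 500 pre-aggregated counts with a few BatchGetItem calls (up to 100 keys per call) instead of issuing 500 queries per request.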

Here's a good article... Real-Time Aggregation with DynamoDB Streams

Upvotes: 2
