Get Off My Lawn

Reputation: 36311

DynamoDB Count Group By

We are trying to query a DynamoDB table and need a count of the items in each group. How can this be done?

I have tried this, but it stops working as soon as I add the second number:

$search = array(
    'TableName'     => 'dev_adsite_rating',
    'Select'        => 'COUNT',
    'KeyConditions' => array(
        'ad_id' => array(
            'ComparisonOperator' => 'EQ',
            'AttributeValueList' => array(
                array('N' => 1039722, 'N' => 1480)
            )
        )
    )
);
$response = $client->query($search);

The SQL version would look something like this:

select ad_id, count(*) 
from dev_adsite_rating
where ad_id in(1039722, 1480)
group by ad_id;

So, is there a way for us to achieve this? I cannot find anything on it.

Upvotes: 5

Views: 12423

Answers (2)

Erik Norman

Reputation: 11

You could add a Lambda function, triggered by the table's DynamoDB stream, that aggregates your data on the fly; in your case it would add 1 to the relevant counter. Your search function would then simply read the aggregated data directly.

Example: if you have a weekly online voting system where you need to store each vote (also to check that no user votes twice), you could aggregate the votes on the fly using something like this:

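import { DynamoDBStreamEvent, DynamoDBStreamHandler } from 'aws-lambda';

// Runs for each batch of stream records and bumps the counter for every stored vote.
export const handler: DynamoDBStreamHandler = async (event: DynamoDBStreamEvent) => {
  await Promise.all(event.Records.map(async record => {
    // Count only newly inserted votes, so a later MODIFY of the same item is not counted twice.
    if (record.eventName === 'INSERT' && record.dynamodb?.NewImage?.vote?.S && record.dynamodb?.NewImage?.week?.S) {
      await addVoteToResults(record.dynamodb.NewImage.vote.S, record.dynamodb.NewImage.week.S)
    }
  }))
}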

where addVoteToResults is something like:

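import { DynamoDB } from 'aws-sdk';

// Assumption: AWS SDK v2 DocumentClient, as the .promise() call and plain JS values suggest.
const dynamoDbClient = new DynamoDB.DocumentClient();

// Atomically increments the counter attribute named after the vote on that week's item.
export const addVoteToResults = async (vote: string, week: string) => {
  await dynamoDbClient.update({
    TableName: 'table_name',
    Key: { week: week },
    // ADD starts from 0 when the attribute does not exist yet.
    UpdateExpression: 'ADD #vote :inc',
    ExpressionAttributeNames: {
      '#vote': vote
    },
    ExpressionAttributeValues: {
      ':inc': 1
    }
  }).promise();
}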

Afterwards, when voting is closed, you can retrieve the aggregated votes for a week with a single get request. This solution also helps spread the write/read load, rather than causing a large spike whenever your search function runs.
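As a rough sketch of that final read, assuming the aggregated item looks like { week, optionA: 3, optionB: 5 }: the `getItem` parameter below is a hypothetical stand-in for whatever get call you use (for example a DocumentClient get on 'table_name'), and `getWeeklyResults` is an illustrative name, not part of any SDK.

```typescript
// Assumed shape of the aggregated item: the partition key plus one
// numeric counter attribute per vote option.
type AggregatedItem = { week: string; [option: string]: string | number };

// Hypothetical stand-in for the real get request against the results table.
type GetItem = (key: { week: string }) => Promise<AggregatedItem | undefined>;

// Reads one aggregated item and returns only its counters, keyed by option.
async function getWeeklyResults(
  getItem: GetItem,
  week: string
): Promise<Record<string, number>> {
  const item = await getItem({ week });
  const results: Record<string, number> = {};
  if (!item) {
    return results; // no votes were recorded for that week
  }
  for (const name of Object.keys(item)) {
    const value = item[name];
    // Skip the key attribute; everything numeric is treated as a counter.
    if (name !== 'week' && typeof value === 'number') {
      results[name] = value;
    }
  }
  return results;
}
```

Passing the get call in as a parameter keeps the sketch independent of the SDK version and makes it easy to try with a fake.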

Upvotes: 1

Raymond Lin

Reputation: 491

Performing a query like this in DynamoDB is slightly trickier than in the SQL world. You'll need to consider a few things:

  1. EQ only on the hash key: A key condition on the hash key accepts only EQ with a single value, so you'll need to make two queries (i.e. ad_id EQ 1039722 / ad_id EQ 1480).
  2. Paginate through each query: Because DynamoDB returns the result set in increments, you'll need to page through the results, passing each response's LastEvaluatedKey as the next request's ExclusiveStartKey until none is returned.
  3. Running "Count": Take the "Count" property from each Query response and add it to a running total for that ad_id as you paginate through the results of both queries.
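The three steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `queryPage` is a hypothetical wrapper around your SDK's Query call (issued with Select set to COUNT and a key condition on ad_id), and `countByAdId` is an illustrative name. Only `Count`, `LastEvaluatedKey` and `ExclusiveStartKey` come from the Query API itself.

```typescript
// One page of a Query response when Select is 'COUNT'.
interface CountPage {
  Count: number;
  LastEvaluatedKey?: Record<string, unknown>;
}

// Hypothetical wrapper: runs one Query for a single ad_id, optionally
// resuming from the previous page's LastEvaluatedKey (ExclusiveStartKey).
type QueryCountPage = (
  adId: number,
  exclusiveStartKey?: Record<string, unknown>
) => Promise<CountPage>;

// Emulates "SELECT ad_id, COUNT(*) ... GROUP BY ad_id" for a fixed list of ids.
async function countByAdId(
  queryPage: QueryCountPage,
  adIds: number[]
): Promise<Record<number, number>> {
  const counts: Record<number, number> = {};
  for (const adId of adIds) {
    // Step 1: one query per hash key value.
    let total = 0;
    let startKey: Record<string, unknown> | undefined = undefined;
    do {
      // Step 2: fetch the next page of this query.
      const page: CountPage = await queryPage(adId, startKey);
      // Step 3: add the page's Count to the running total.
      total += page.Count;
      startKey = page.LastEvaluatedKey;
    } while (startKey !== undefined);
    counts[adId] = total;
  }
  return counts;
}
```

Called as countByAdId(queryPage, [1039722, 1480]), this resolves to an object with one total per ad_id. The ids are queried one after another here for clarity; running them concurrently is possible but uses read capacity in bursts.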

Upvotes: 1
