Gurudeepak

Reputation: 431

How to optimize DynamoDB Query response time?

We're using API Gateway + Lambda + DynamoDB to fetch data, using the DynamoDB Query method. For 260.4 KB of data (total item count: 675 | scanned count: 3327) it's taking 3.49s.

Requirement:

We have 4+ clients. We calculate each client's sales users' data on a daily basis and store it in the DB.

Table Structure:

In the Query, we use the primary key (ClientId & Date) to get the data, roughly as sketched below.
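
A rough sketch of that queryInput (assuming the SDK v2 DocumentClient; the table name and values are placeholders, and Date needs an expression attribute name because it's a DynamoDB reserved word):

const queryInput = {
  TableName: 'ClientSalesUsers', // placeholder table name
  KeyConditionExpression: 'ClientId = :clientId AND #dt = :date',
  ExpressionAttributeNames: { '#dt': 'Date' }, // Date is a reserved word
  ExpressionAttributeValues: {
    ':clientId': 'CLIENT1', // placeholder values
    ':date': '2021-03-24',
  },
};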

Currently we're using on-demand capacity mode for DynamoDB, yet we feel a response time > 1s is too much.

Is there any way we can improve this using any AWS configurations?

Update [24/03/2021]: In Lambda we are using Node.js.

module.exports.executeQuery = (dynamoDbClient, queryInput) => {
  // Wrap the callback-style query in a Promise so callers can await it.
  return new Promise((resolve, reject) => {
    dynamoDbClient.query(queryInput, (err, users) => {
      if (err) {
        // handleQueryError is defined elsewhere in this module.
        reject(handleQueryError(err));
      } else {
        resolve({
          statusCode: 200,
          users,
        });
      }
    });
  });
};

Memory Provisioned to Lambda = 128 MB

Upvotes: 3

Views: 3807

Answers (2)

Maurice

Reputation: 13108

As suggested in the comments, I'd start by increasing the memory size of the Lambda function.

Lambda CPU performance scales with memory, and in my experience parsing larger responses from DynamoDB benefits a lot from the extra CPU.
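
For example, raising the memory is a one-line configuration change via the CLI (the function name is a placeholder):

aws lambda update-function-configuration --function-name yourFunction --memory-size 1024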

I did a performance analysis in a blog post a couple of days ago (disclaimer: my employer's tech blog; it's on topic, albeit for Python) and found significant performance differences between memory sizes.

Upvotes: 4

F_SO_K

Reputation: 14799

Your query is scanning 3327 items, so the ~3.5s response time doesn't surprise me. That sounds about right in my experience.

The underlying problem here is a lack of parallel processing. You can easily prove that this is the case by running this CLI command:

aws dynamodb scan --table-name YOURTABLENAME --total-segments X --segment 0 --select COUNT

Replace YOURTABLENAME, and set X to the number of MBs of data in your table. So if you have 100 MB of data, use 100.

This will do a parallel scan with X segments. It will return in about 1s, and it's reading every item in your table.

You can then try a scan with --total-segments 1 (which runs with one thread) and see how much longer it takes.

What this demonstrates is the need to fetch large amounts of data in parallel.
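
You can run the same parallel scan from Node.js. A minimal sketch with the SDK v2 DocumentClient (the table name is a placeholder):

const AWS = require('aws-sdk');
const dynamoDbClient = new AWS.DynamoDB.DocumentClient();

// Scan all segments in parallel; each segment follows its own pagination.
const parallelScan = async (tableName, totalSegments) => {
  const scanSegment = async (segment) => {
    const items = [];
    let lastKey;
    do {
      const page = await dynamoDbClient.scan({
        TableName: tableName,
        TotalSegments: totalSegments,
        Segment: segment,
        ExclusiveStartKey: lastKey,
      }).promise();
      items.push(...page.Items);
      lastKey = page.LastEvaluatedKey;
    } while (lastKey);
    return items;
  };

  const segments = [...Array(totalSegments).keys()];
  const results = await Promise.all(segments.map(scanSegment));
  return results.flat();
};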

Your partitions are too large. If you try a key with less data, perhaps tens of records, I expect the query will be fast.

You might want to look into sharding techniques to reduce the amount of data in your partitions; you can then Query those partitions in parallel. Note that DynamoDB does not provide a BatchQuery method, which is a shame, so you have to write your own parallel Query logic (see the sketch below).
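
A rough sketch of such a parallel Query, assuming the partition key is sharded with a numeric suffix (e.g. CLIENT1#0 ... CLIENT1#3); the attribute names and shard scheme here are assumptions, not your actual schema:

const AWS = require('aws-sdk');
const dynamoDbClient = new AWS.DynamoDB.DocumentClient();

// Query every shard of a client's data for one date in parallel and merge.
// (Pagination via LastEvaluatedKey is omitted for brevity.)
const queryAllShards = async (tableName, clientId, date, shardCount) => {
  const queryShard = (shard) => dynamoDbClient.query({
    TableName: tableName,
    KeyConditionExpression: 'ClientId = :pk AND #dt = :date',
    ExpressionAttributeNames: { '#dt': 'Date' }, // Date is a reserved word
    ExpressionAttributeValues: {
      ':pk': `${clientId}#${shard}`, // sharded partition key
      ':date': date,
    },
  }).promise();

  const shards = [...Array(shardCount).keys()];
  const pages = await Promise.all(shards.map(queryShard));
  return pages.flatMap((page) => page.Items);
};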

Upvotes: 5
