Reputation: 2546
I need to bulk insert records into a DynamoDB table on a weekly basis. I do this by dropping the table, creating a new table with On Demand capacity, then using BatchWriteItem to populate it. According to the documentation, newly created tables with On Demand capacity can serve up to 4,000 WCUs. No matter what I try, though, the most I can get is 1,487 WCUs. I have tried the following:
Although the throughput differs from experiment to experiment and from execution to execution, 1,487 WCUs comes up often enough that there may be some significance to that number.
What do I need to do to leverage the full 4,000 WCUs available to me?
Upvotes: 1
Views: 273
Reputation: 13117
Your limitation appears to be on the writer side. I've written a small Python script to create and load-test a table.
We can see that DynamoDB easily scales up to 4,000 WRU with 8 worker processes, then throttles briefly and afterwards scales up again. To get more throughput, I'd have to add more writer processes:
Here is the script for your convenience:
import multiprocessing
import uuid

import boto3
from botocore.exceptions import ClientError

TABLE = "speed-measurement"
NUMBER_OF_WORKERS = 8


def create_table_if_not_exists(table_name: str):
    try:
        boto3.client("dynamodb").create_table(
            AttributeDefinitions=[{"AttributeName": "PK", "AttributeType": "S"}],
            TableName=table_name,
            KeySchema=[{"AttributeName": "PK", "KeyType": "HASH"}],
            BillingMode="PAY_PER_REQUEST"
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ResourceInUseException":
            pass  # table already exists
        else:
            raise


def write_fast(worker_num):
    table = boto3.resource("dynamodb").Table(TABLE)
    counter = 0
    with table.batch_writer() as batch:
        while True:
            counter += 1
            batch.put_item(
                Item={
                    "PK": str(uuid.uuid4())
                }
            )
            if counter % 1000 == 0:
                print(f"Worker #{worker_num} wrote item #{counter}")


def main():
    create_table_if_not_exists(TABLE)
    with multiprocessing.Pool(NUMBER_OF_WORKERS) as pool:
        pool.map(write_fast, range(NUMBER_OF_WORKERS))


if __name__ == "__main__":
    main()
Just run it with Python 3 and stop it with Ctrl+C once you see the desired metrics. It creates the table and writes as fast as it can across 8 processes; you can also increase NUMBER_OF_WORKERS.
Source for the CloudWatch Graphic:
{
"metrics": [
[ { "expression": "m2/60", "label": "Write Request Units", "id": "e1", "color": "#2ca02c" } ],
[ "AWS/DynamoDB", "WriteThrottleEvents", "TableName", "speed-measurement", { "yAxis": "right", "id": "m1" } ],
[ ".", "ConsumedWriteCapacityUnits", ".", ".", { "stat": "Sum", "period": 1, "id": "m2", "visible": false } ]
],
"view": "timeSeries",
"stacked": false,
"region": "eu-central-1",
"stat": "Maximum",
"period": 60,
"yAxis": {
"left": {
"label": "Consumed Write Request Units",
"showUnits": false
},
"right": {
"label": "Write Throttle Events",
"showUnits": false
}
},
"annotations": {
"horizontal": [
{
"color": "#9edae5",
"label": "Initial Limit",
"value": 4000,
"fill": "below"
}
]
},
"legend": {
"position": "bottom"
},
"setPeriodToTimeRange": true
}
Upvotes: 2