ricardofunke

Reputation: 71

How to set retry delay options for DynamoDB using Boto3 with Python?

I'm trying to avoid the ProvisionedThroughputExceededException by setting a custom exponential backoff using the "base" option like we can do in JavaScript according to this answer:

AWS.config.update({
  maxRetries: 15,
  retryDelayOptions: {base: 500}
});

As explained in this documentation, the "base" parameter is the base number of milliseconds used in the exponential backoff, so "base: 500" would let the delay grow up to roughly 500, 1000, 2000, ... milliseconds on successive retries.

I'm trying to apply the same settings with Boto3 using Python 3.8, but the Boto3 documentation seems to allow setting only the maximum number of retry attempts, not the delay options:

from botocore.client import Config

config = Config(
   retries = {
      'max_attempts': 10,
      'mode': 'standard'
   }
)

The "mode" option accepts only three values: "legacy", "standard" and "adaptive". The documentation also mentions a _retry.json file where these options are described, and it looks like the "base" option is hardcoded in this file:

"dynamodb": {
      "__default__": {
        "max_attempts": 10,
        "delay": {
          "type": "exponential",
          "base": 0.05,
          "growth_factor": 2
        }
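
To make the numbers above concrete, here is a minimal sketch of the delay schedule such a config would produce. It assumes the legacy handler computes each delay as base * growth_factor ** (attempt - 1); the helper name exponential_delays is hypothetical and not part of Boto3.

```python
def exponential_delays(base, growth_factor, attempts):
    """Return the delay (in seconds) before each retry.

    Assumed formula: base * growth_factor ** (n - 1) for retry n.
    """
    return [base * growth_factor ** (n - 1) for n in range(1, attempts + 1)]


# DynamoDB defaults quoted above: base 0.05 s, growth factor 2.
print(exponential_delays(0.05, 2, 5))
# The same formula with a 0.5 s base, comparable to "base: 500" in JavaScript.
print(exponential_delays(0.5, 2, 5))
```

With a base of 0.05 s the first few retries happen almost immediately, which is why a larger base is attractive when a table is being throttled.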

So, my question is: Is there any way to set the exponential backoff using Boto3 with Python for DynamoDB?

Upvotes: 1

Views: 3882

Answers (2)

ricardofunke

Reputation: 71

As @Ermiya mentioned, I had to implement it myself. I didn't want to modify the Boto3 default settings, so the code has to catch the exception and keep paginating from where it stopped.

This is how it can be done:

Without Paginator (much simpler)

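from time import sleep

import boto3
from boto3.dynamodb.conditions import Key
from botocore.exceptions import ClientError

dynamodb = boto3.resource('dynamodb', region_name='us-west-2')
db_table = dynamodb.Table('<my_table_name>')

# The Table resource already knows its table name, so TableName is not repeated here.
query_params = {
    "IndexName": 'movies-index',
    "KeyConditionExpression": Key('movies').eq('thriller'),
}


def query_with_backoff(table, params, max_retries=6):
    """Yield every item of the query, backing off exponentially when throttled."""
    params = dict(params)  # copy so the caller's dict is not mutated
    retries = 1
    while True:
        if retries >= max_retries:
            raise Exception(f'ProvisionedThroughputExceededException: max_retries was reached: {max_retries}')
        try:
            page = table.query(**params)
        except ClientError as err:
            # Re-raise anything that is not a throttling error.
            if err.response['Error']['Code'] != 'ProvisionedThroughputExceededException':
                raise
            sleep(2 ** retries)  # wait 2, 4, 8, ... seconds
            retries += 1
            continue
        else:
            retries = 1  # reset the counter after a successful call
        yield from page.get('Items', [])
        if page.get('LastEvaluatedKey'):
            # Resume the next request where this page ended.
            params.update(ExclusiveStartKey=page['LastEvaluatedKey'])
            sleep(2)
        else:
            break


for item in query_with_backoff(db_table, query_params):
    print(item)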

Using Paginator:

from time import sleep

import boto3
from boto3.dynamodb.conditions import ConditionExpressionBuilder, Key
from boto3.dynamodb.types import TypeDeserializer, TypeSerializer
from botocore.exceptions import ClientError

db_client = boto3.client('dynamodb', region_name='us-west-2')

td = TypeDeserializer()
ts = TypeSerializer()
query_params = {
    "TableName": '<my_dynamodb_table>',
    "IndexName": 'movies-index',
    "KeyConditionExpression": Key('movies').eq('thriller'),
}
# The low-level client needs the condition as an expression string plus
# placeholder maps, with the values serialized to DynamoDB's wire format.
builder = ConditionExpressionBuilder()
condition = query_params["KeyConditionExpression"]
expr = builder.build_expression(condition, is_key_condition=True)
query_params.update({
    "KeyConditionExpression": expr.condition_expression,
    "ExpressionAttributeNames": expr.attribute_name_placeholders,
    "ExpressionAttributeValues": {k: ts.serialize(v) for k, v in expr.attribute_value_placeholders.items()},
})


def paginate_with_backoff(client, params, max_retries=6):
    """Yield every deserialized item, resuming after throttling errors."""
    params = dict(params)  # copy so the caller's dict is not mutated
    paginator = client.get_paginator('query')
    total = 0
    retries = 1
    while True:
        if retries >= max_retries:
            raise Exception(f'ProvisionedThroughputExceededException: max_retries was reached: {max_retries}')
        try:
            # A fresh page iterator is built on every attempt so that it
            # starts from the last key seen instead of from the beginning.
            for page in paginator.paginate(**params):
                retries = 1
                if page.get('LastEvaluatedKey'):
                    params['ExclusiveStartKey'] = page['LastEvaluatedKey']
                for db_item in page.get('Items', []):
                    yield {k: td.deserialize(v) for k, v in db_item.items()}
                total += page.get('Count', 0)
                print(f"{total=}", end='\r')
        except ClientError as err:
            # Re-raise anything that is not a throttling error.
            if err.response['Error']['Code'] != 'ProvisionedThroughputExceededException':
                raise
            sleep(2 ** retries)  # wait 2, 4, 8, ... seconds
            retries += 1
        else:
            break
    print(f"{total=}")


for item in paginate_with_backoff(db_client, query_params):
    print(item)

Upvotes: 2

Ermiya Eskandary

Reputation: 23672

Unfortunately, this is hardcoded in Boto3, and there is no way to modify the base retry delay using the Python SDK. Unless you write your own wrapper around the SDK calls, this isn't possible out of the box.

It may be worth opening an issue in the public repository so that it can be picked up, or contributing the change directly.
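
The wrapper idea mentioned in this answer could look roughly like the sketch below. It is only an illustration under stated assumptions: the name call_with_backoff and its parameters (base, growth_factor, max_attempts, retry_codes, sleep) are hypothetical, and the only thing it relies on from the SDK is that botocore's ClientError carries the service error code in err.response['Error']['Code'].

```python
import time


def call_with_backoff(func, *args, base=0.5, growth_factor=2, max_attempts=10,
                      retry_codes=("ProvisionedThroughputExceededException",),
                      sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying throttling errors with exponential backoff.

    Hypothetical helper, not part of Boto3. The delay before retry n is
    base * growth_factor ** (n - 1) seconds.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as err:
            # botocore's ClientError exposes the service error code here;
            # exceptions without a matching code are re-raised untouched.
            code = getattr(err, "response", {}).get("Error", {}).get("Code")
            if code not in retry_codes or attempt == max_attempts:
                raise
            sleep(base * growth_factor ** (attempt - 1))
```

A call such as call_with_backoff(db_table.query, base=0.5, **query_params) would then retry a throttled query after 0.5, 1, 2, ... seconds. When using a wrapper like this, it may make sense to lower 'max_attempts' in the Boto3 Config so the SDK's built-in retries and the wrapper's retries do not multiply.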

Upvotes: 2
