Kush Patel

Reputation: 1251

AWS Error "Calling the invoke API action failed with this message: Rate Exceeded" when I use s3.get_paginator('list_objects_v2')

A third-party application uploads around 10,000 objects a day to my bucket+prefix. My requirement is to fetch all objects that were uploaded to my bucket+prefix in the last 24 hours.

There are a very large number of files under this bucket+prefix.

So I assume that when I call

response = s3_paginator.paginate(Bucket=bucket,Prefix='inside-bucket-level-1/', PaginationConfig={"PageSize": 1000})

then maybe it makes multiple calls to the S3 API, and maybe that's why it is showing the Rate Exceeded error.

Below is my Python Lambda Function.

import json
import boto3
import time
from datetime import datetime, timedelta


def lambda_handler(event, context):
    s3 = boto3.client("s3")
    from_date = datetime.today() - timedelta(days=1)
    string_from_date = from_date.strftime("%Y-%m-%d, %H:%M:%S")
    print("Date :", string_from_date)
    s3_paginator = s3.get_paginator('list_objects_v2')
    list_of_buckets = ['kush-dragon-data']
    bucket_wise_list = {}
    for bucket in list_of_buckets:

        response = s3_paginator.paginate(Bucket=bucket,Prefix='inside-bucket-level-1/', PaginationConfig={"PageSize": 1000})

        # JMESPath filter: keep only keys whose LastModified is on/after from_date
        filtered_iterator = response.search(
            "Contents[?to_string(LastModified)>='\"" + string_from_date + "\"'].Key")

        keylist = []
        for key_data in filtered_iterator:

            if "/" in key_data:
                splitted_array = key_data.split("/")
                if len(splitted_array) > 1:
                    if splitted_array[-1]:
                        keylist.append(splitted_array[-1])
            else:
                keylist.append(key_data)

        bucket_wise_list.update({bucket: keylist})

    print("Total Number Of Object = ", bucket_wise_list)

    # TODO implement
    return {
        'statusCode': 200,
        'body': json.dumps(bucket_wise_list)
    }

When we execute the above Lambda function, it shows the error below.

"Calling the invoke API action failed with this message: Rate Exceeded."

Can anyone help resolve this error so I can meet my requirement?

Upvotes: 0

Views: 10540

Answers (2)

ThisGuyCantEven

Reputation: 1267

This is most likely due to you reaching your quota limit for AWS S3 API calls. The "bigger hammer" solution is to request a quota increase, but if you don't want to do that, there is another way: the built-in retries in botocore.Config, for example:

import json
import time
from datetime import datetime, timedelta
from boto3 import client
from botocore.config import Config

config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'standard'
    }
)
def lambda_handler(event, context):
    s3 = client('s3', config=config)

###ALL OF YOUR CURRENT PYTHON CODE EXACTLY THE WAY IT IS###

This config will use an exponentially increasing sleep timer for a maximum number of retries. From the docs:

  • Any retry attempt will include an exponential backoff by a base factor of 2 for a maximum backoff time of 20 seconds.

There is also an adaptive mode, which is still experimental. For more info, see the docs on botocore.Config retries.
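If you want to experiment with adaptive mode, only the mode value changes; a minimal sketch (untested, same client setup as above):

from boto3 import client
from botocore.config import Config

# 'adaptive' adds client-side rate limiting on top of the standard retries
# (botocore documents this mode as experimental)
adaptive_config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'
    }
)

s3 = client('s3', config=adaptive_config)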

Another (much less robust, IMO) option would be to write your own paginator with a sleep programmed in, though you'd probably just want to use the built-in backoff in 99.99% of cases (even if you do have to write your own paginator). Note that the code below is untested and isn't asynchronous, so the sleep adds to the wait time for each page response; to make the delay exactly sleep_secs you'd need concurrent.futures or asyncio (the AWS built-in paginators mostly use concurrent.futures):

from boto3 import client
from typing import Generator
from time import sleep

def get_pages(bucket: str, prefix: str, page_size: int, sleep_secs: float) -> Generator:
    s3 = client('s3')
    # First page: no continuation token yet
    page: dict = s3.list_objects_v2(
        Bucket=bucket,
        MaxKeys=page_size,
        Prefix=prefix
    )
    next_token: str = page.get('NextContinuationToken')
    yield page
    while next_token:
        # Sleep between requests to stay under the API rate limit
        sleep(sleep_secs)
        page = s3.list_objects_v2(
            Bucket=bucket,
            MaxKeys=page_size,
            Prefix=prefix,
            ContinuationToken=next_token
        )
        next_token = page.get('NextContinuationToken')
        yield page
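For example, the generator could be consumed like this to collect keys modified in the last 24 hours (also untested; it reuses the bucket and prefix names from the question as placeholders):

from datetime import datetime, timedelta, timezone

cutoff = datetime.now(timezone.utc) - timedelta(days=1)
recent_keys = []
for page in get_pages('kush-dragon-data', 'inside-bucket-level-1/', 1000, 1.0):
    for obj in page.get('Contents', []):
        # LastModified comes back as a timezone-aware datetime, so compare directly
        if obj['LastModified'] >= cutoff:
            recent_keys.append(obj['Key'])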

Upvotes: 0

Devyl

Reputation: 645

This is probably due to your account restrictions. You should add a retry with a few seconds between attempts, or increase the page size.
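For instance, a manual retry with a sleep between attempts could look roughly like this (a sketch only; the helper name, error codes handled, and wait times are placeholders, not from the question):

import time
from boto3 import client
from botocore.exceptions import ClientError

def list_with_retry(bucket, prefix, max_retries=5, wait_secs=2):
    s3 = client('s3')
    for attempt in range(max_retries):
        try:
            paginator = s3.get_paginator('list_objects_v2')
            pages = paginator.paginate(Bucket=bucket, Prefix=prefix,
                                       PaginationConfig={'PageSize': 1000})
            # Flatten all pages into a single list of keys
            return [obj['Key'] for page in pages for obj in page.get('Contents', [])]
        except ClientError as e:
            # Back off a little longer on each throttling error, then retry
            if e.response['Error']['Code'] in ('Throttling', 'SlowDown', 'RequestLimitExceeded'):
                time.sleep(wait_secs * (attempt + 1))
            else:
                raise
    raise RuntimeError('Listing still throttled after retries')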

Upvotes: 0
