AWS S3: list keys that begin with a string

Question

I am using Python in an AWS Lambda function to list the keys in an S3 bucket that begin with a specific ID:

import os

for obj in mybucket.objects.all():
    # Lists every object in the bucket, then extracts the ID client-side.
    file_name = os.path.basename(obj.key)
    match_id = file_name.split('_', 1)[0]

The problem is that if the S3 bucket has several thousand files, the iteration is very inefficient and the Lambda function sometimes times out.

Here is an example file name

https://s3.console.aws.amazon.com/s3/object/bucket-name/012345_abc_happy.jpg

I want to iterate only over objects that contain "012345" in the key name. Any good suggestions on how I can accomplish that?

Kannaiyan · Accepted Answer

Here is how you can solve it.

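S3 stores everything as objects identified by a key; there are no real folders or file names. The folder view exists only for user convenience, so you can filter on the start of the key (the prefix).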

aws s3 ls s3://bucket/folder1/folder2/filenamepart --recursive

will list all S3 objects whose keys begin with that prefix.

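import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('bucketname')
# Prefix is matched by S3 against the start of each key, so only matching objects are returned.
for obj in my_bucket.objects.filter(Prefix='012345'):
    print(obj.key)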
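
If you prefer the low-level client, the same prefix filter can be sketched with the `list_objects_v2` paginator, which returns results one page at a time. This is only a minimal sketch: the helper name `keys_with_prefix` is made up for illustration, and the bucket name and prefix are placeholders. Note that the prefix is compared with the start of the full key, so if your objects sit under a "folder" you would need to include that path in the prefix.

```python
def keys_with_prefix(client, bucket, prefix):
    """Yield every key in `bucket` that begins with `prefix`.

    `client` is expected to be a boto3 S3 client (or anything that offers
    the same get_paginator("list_objects_v2") interface).
    """
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no "Contents" entry when nothing matched.
        for item in page.get("Contents", []):
            yield item["Key"]


# Usage (needs boto3 and AWS credentials; names below are placeholders):
#   import boto3
#   for key in keys_with_prefix(boto3.client("s3"), "bucketname", "012345_"):
#       print(key)
```

Because the filtering happens on the S3 side, the function only receives the matching keys instead of the whole bucket listing.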

To speed up the listing, you can run multiple scripts in parallel, for example one per prefix.
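
As a rough sketch of the parallel idea, the listings can also be run from threads inside one process instead of from separate scripts (that is a substitution on my part, not the only way to do it). The helper name `list_prefixes_in_parallel` and the `list_keys` callable are hypothetical; `list_keys` stands for whatever function you use to list keys for a single prefix.

```python
from concurrent.futures import ThreadPoolExecutor


def list_prefixes_in_parallel(list_keys, prefixes, max_workers=8):
    """Call list_keys(prefix) for each prefix concurrently.

    Returns a dict mapping each prefix to the list of keys found for it.
    `list_keys` is a caller-supplied (hypothetical) function that returns
    an iterable of keys for one prefix.
    """
    prefixes = list(prefixes)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input prefixes.
        results = pool.map(lambda p: list(list_keys(p)), prefixes)
        return dict(zip(prefixes, results))


# Usage (assumption: the resource is created inside the callable, since the
# boto3 documentation advises against sharing resource objects across threads):
#   import boto3
#   def list_keys(prefix):
#       bucket = boto3.resource("s3").Bucket("bucketname")
#       return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]
#   found = list_prefixes_in_parallel(list_keys, ["012345_", "067890_"])
```

This only helps when you have several prefixes to look up; a single prefix is still listed sequentially.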

Hope it helps.

Answers (2)