Reputation: 9804
I can grab and read all the objects in my AWS S3 bucket via
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
all_objs = bucket.objects.all()
for obj in all_objs:
    pass  # filter only the objects I need
and then
obj.key
would give me the path within the bucket.
Is there a way to filter beforehand for only those files under a certain starting path (a directory in the bucket), so that I avoid looping over all the objects and filtering afterwards?
Upvotes: 31
Views: 83665
Reputation: 965
For folks using boto3.client('s3') rather than boto3.resource('s3'), you can use the 'Prefix' parameter to filter the objects listed from the S3 bucket:
import boto3

s3 = boto3.client('s3')
params = {
    "Bucket": "HelloWorldBucket",
    "Prefix": "Happy"
}
happy_objects = s3.list_objects_v2(**params)
The above snippet fetches all objects in 'HelloWorldBucket' whose keys start with 'Happy', i.e. the contents of the 'Happy' folder.
PS: a folder in S3 is just a construct; it is implemented as a prefix of the file/object key.
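Note that a single list_objects_v2 call returns at most 1,000 keys and truncates the rest, so if the prefix can match more objects than that, a paginator is the usual client-side workaround. A minimal sketch, reusing the placeholder bucket and prefix names from above:
import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# Each page is a normal list_objects_v2 response; 'Contents' is absent
# on pages with no matching keys, hence the .get() with a default.
for page in paginator.paginate(Bucket='HelloWorldBucket', Prefix='Happy'):
    for obj in page.get('Contents', []):
        print(obj['Key'])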
Upvotes: 8
Reputation: 179
If we just need a list of object keys, then bucket.objects.filter is a better alternative to list_objects or list_objects_v2, as those calls return at most 1,000 objects per request. Reference: list_objects_v2
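For instance, a short sketch (bucket and prefix names are placeholders) that collects every matching key through the resource API; the collection pages through the underlying list calls for you, so it is not capped at 1,000 results:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')  # placeholder bucket name

# Iterating the filtered collection lazily issues as many paged list
# calls as needed, so every key under the prefix is returned.
keys = [obj.key for obj in bucket.objects.filter(Prefix='myprefix')]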
Upvotes: 2
Reputation: 52939
Use the filter method of collections such as bucket.objects:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
objs = bucket.objects.filter(Prefix='myprefix')
for obj in objs:
    pass
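If 'myprefix' is meant as a directory, including the trailing slash keeps keys that merely share the prefix (e.g. a hypothetical 'myprefix-old/...') out of the results, and obj.key then gives the path within the bucket that the question asks about:
for obj in bucket.objects.filter(Prefix='myprefix/'):
    print(obj.key)  # full key (path) within the bucket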
Upvotes: 64