Reputation: 13286
You can download a file via boto3 from a RequesterPays S3 bucket, as follows:
s3_client.download_file('aws-naip', 'md/2013/1m/rgbir/38077/{}'.format(filename), full_path, {'RequestPayer':'requester'})
What I can't figure out is how to list the objects in the bucket... I get an authentication error when I try and call objects.all() on the bucket.
How can I use boto3 to enumerate the contents of a RequesterPays bucket? Please note this is a particular kind of bucket where the requester pays the S3 charges.
Upvotes: 8
Views: 10669
Reputation: 1
I had the same issue so here is the code:
import boto3

s3 = boto3.resource('s3')
for bucket in s3.buckets.all():
    print(bucket.name)

client = boto3.client('s3')
result = client.list_objects(Bucket='bucketname', RequestPayer='requester')
for o in result['Contents']:
    print(o['Key'])
The response to the query is a dictionary. Its 'Contents' entry is a list of dictionaries, one per object, and the 'Key' field of each holds the path to that object. You can check the other response fields in the list_objects documentation.
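To make the response layout concrete, here is a minimal sketch of pulling the keys out of such a response. The helper name extract_keys and the sample dictionary are made up for illustration; the sample only imitates the fields used here and is not real output from any bucket.

```python
def extract_keys(response):
    """Return the object keys from a list_objects-style response dict."""
    # 'Contents' is absent when nothing matched, so default to an empty list.
    return [obj['Key'] for obj in response.get('Contents', [])]


# Hand-written sample that imitates the response layout (not real output).
sample_response = {
    'IsTruncated': False,
    'Contents': [
        {'Key': 'folder/a.txt', 'Size': 12},
        {'Key': 'folder/b.txt', 'Size': 34},
    ],
}

print(extract_keys(sample_response))
```

Using .get('Contents', []) instead of indexing avoids a KeyError when the listing comes back empty.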
Note: list_objects returns at most 1000 objects per call, so for larger listings you have to repeat the call, passing the Marker parameter to continue from where the previous page ended (I will update this answer if you would like the full list). I assume you have already figured out how to set up the access key and secret key. Let me know if you need more details on that.
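For reference, a minimal sketch of that loop could look like the following. The function name iter_keys_v1 is hypothetical, and the client argument is assumed to be a boto3 S3 client (for example boto3.client('s3')). It relies on the IsTruncated field of the response and, because no Delimiter is passed, falls back to the last key of each page as the next Marker.

```python
def iter_keys_v1(client, bucket, prefix=''):
    """Yield every key under prefix using list_objects and Marker paging."""
    kwargs = {'Bucket': bucket, 'Prefix': prefix, 'RequestPayer': 'requester'}
    while True:
        resp = client.list_objects(**kwargs)
        contents = resp.get('Contents', [])
        for obj in contents:
            yield obj['Key']
        # Stop when S3 reports that no further pages remain.
        if not resp.get('IsTruncated') or not contents:
            break
        # NextMarker is only returned when a Delimiter is given; otherwise
        # the last key of this page serves as the marker for the next call.
        kwargs['Marker'] = resp.get('NextMarker', contents[-1]['Key'])

# Hypothetical usage (issues real, billed requests):
#   import boto3
#   keys = list(iter_keys_v1(boto3.client('s3'), 'bucketname'))
```

Since it is a generator, you can stop iterating early and no further requests are made.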
Upvotes: 0
Reputation: 385
You have to pass the RequestPayer kwarg to the list_objects method.
Also, according to the boto3 docs,
Note: ListObjectsV2 is the revised List Objects API and we recommend you use this revised API for new application development
Putting that together with pagination would look like:
import boto3

s3_client = boto3.client('s3')

def get_keys(bucket, prefix, requester_pays=False):
    """Get s3 objects from a bucket/prefix
    optionally use requester-pays header
    """
    extra_kwargs = {}
    if requester_pays:
        extra_kwargs = {'RequestPayer': 'requester'}

    # 'init' is a sentinel meaning "first request, no token yet"
    next_token = 'init'
    while next_token:
        kwargs = extra_kwargs.copy()
        if next_token != 'init':
            kwargs.update({'ContinuationToken': next_token})

        resp = s3_client.list_objects_v2(
            Bucket=bucket, Prefix=prefix, **kwargs)

        # the token is only present when more pages remain
        try:
            next_token = resp['NextContinuationToken']
        except KeyError:
            next_token = None

        # 'Contents' is missing when no keys match, so default to empty
        for contents in resp.get('Contents', []):
            key = contents['Key']
            yield key
and would be used like
x = list(get_keys('aws-naip', 'co', requester_pays=True))
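As an alternative to tracking the continuation token by hand, boto3 clients expose paginators through get_paginator, and paginate accepts the same keyword arguments as the underlying call. A minimal sketch, assuming a function name (get_keys_paginated) that is made up here and a client argument that is a boto3 S3 client:

```python
def get_keys_paginated(client, bucket, prefix='', requester_pays=False):
    """Yield keys under bucket/prefix using boto3's built-in paginator."""
    kwargs = {'Bucket': bucket, 'Prefix': prefix}
    if requester_pays:
        # Acknowledge that the caller is billed for the list requests.
        kwargs['RequestPayer'] = 'requester'
    paginator = client.get_paginator('list_objects_v2')
    # Each page is one list_objects_v2 response; the paginator passes the
    # continuation token between requests for us.
    for page in paginator.paginate(**kwargs):
        for obj in page.get('Contents', []):
            yield obj['Key']

# Hypothetical usage (issues real, billed requests):
#   import boto3
#   x = list(get_keys_paginated(boto3.client('s3'), 'aws-naip', 'co',
#                               requester_pays=True))
```

Taking the client as a parameter rather than creating it at module level keeps the function easy to exercise with a stub.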
Upvotes: 2
Reputation: 10097
The boto3 documentation describes the S3.Client.list_objects method, which can be used to enumerate objects:
import boto3

s3_client = boto3.client('s3')
# RequestPayer acknowledges that the caller pays for the request
resp = s3_client.list_objects(Bucket='RequesterPays', RequestPayer='requester')
# print names of all objects
for obj in resp['Contents']:
    print('Object Name: %s' % obj['Key'])
Output:
Object Name: pic.gif
Object Name: doc.txt
Object Name: page.html
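If you are getting an access-denied error, make sure that the IAM user calling the API has the s3:ListBucket permission on the bucket. Listing objects requires s3:ListBucket; s3:GetObject only covers downloading them.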
Upvotes: -1