Reputation: 8470
I am trying to replicate the AWS CLI ls command to recursively list files in an AWS S3 bucket. For example, I would use the following command to recursively list all of the files in the "location2" bucket.
aws s3 ls s3://location2 --recursive
What is the AWS SDK for Python (i.e. boto3) equivalent of aws s3 ls s3://location2 --recursive?
Upvotes: 9
Views: 22091
Reputation: 19260
aws s3 ls s3://logs/access/20230104/14/ --recursive
To list the complete path (key) of every file under that prefix, with error handling for the case where nothing is found:
import sys
import boto3
s3_client = boto3.client('s3')
paginator = s3_client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket="logs", Prefix="access/20230104/14/")
for page in pages:
    try:
        for obj in page['Contents']:
            print(obj['Key'])
    except KeyError:
        # 'Contents' is absent from a page that holds no objects
        print("No files exist")
        sys.exit(1)
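To get closer to what the CLI prints, each object's timestamp and size can be shown next to its key. The following is a minimal sketch rather than the CLI's actual implementation: format_ls_line and ls_recursive are hypothetical helper names, the column layout only approximates the CLI output, and timestamps are printed as S3 returns them instead of being converted to local time. The client is passed in as a parameter so the helpers do not depend on how it was created.

```python
def format_ls_line(obj):
    """Format one list_objects_v2 entry roughly like a line of `aws s3 ls`."""
    stamp = obj["LastModified"].strftime("%Y-%m-%d %H:%M:%S")
    # Size is right-aligned in a 10-character column (an approximation)
    return f"{stamp} {obj['Size']:>10} {obj['Key']}"


def ls_recursive(client, bucket, prefix=""):
    """Yield one formatted line for every object under the prefix."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is missing from a page that holds no objects
        for obj in page.get("Contents", []):
            yield format_ls_line(obj)
```

With a real client this could be used as: for line in ls_recursive(boto3.client('s3'), 'logs', 'access/20230104/14/'): print(line). An empty result then shows up as an empty iteration, so the caller decides whether that is an error.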
Upvotes: 3
Reputation: 2565
You'd need to use paginators:
import boto3
client = boto3.client("s3")
bucket = "my-bucket"
paginator = client.get_paginator('list_objects')
page_iterator = paginator.paginate(Bucket=bucket)
for page in page_iterator:
    # 'Contents' is absent when the bucket is empty
    for obj in page.get('Contents', []):
        print(f"s3://{bucket}/{obj['Key']}")
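The reason a paginator is needed is that a single list call returns at most 1,000 objects. As an illustration of what the paginator takes care of, here is a sketch of the same loop written by hand with list_objects_v2 and its continuation token. The function name iter_keys_manually is hypothetical, and the client is passed in as a parameter.

```python
def iter_keys_manually(client, bucket, prefix=""):
    """Yield every key under the prefix by following continuation tokens."""
    kwargs = {"Bucket": bucket}
    if prefix:
        kwargs["Prefix"] = prefix
    while True:
        response = client.list_objects_v2(**kwargs)
        for obj in response.get("Contents", []):
            yield obj["Key"]
        # IsTruncated is true while more results remain on the server
        if not response.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

In practice the paginator is the shorter and safer choice; the manual loop is mainly useful for seeing where the extra requests come from.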
Upvotes: 6
Reputation: 1964
You can also use the minio-py client library; it's open source and compatible with AWS S3.
The list_objects.py example is below; you can refer to the docs for additional information.
from minio import Minio
client = Minio('s3.amazonaws.com',
               access_key='YOUR-ACCESSKEYID',
               secret_key='YOUR-SECRETACCESSKEY')
# List all object paths in bucket that begin with my-prefixname.
objects = client.list_objects('my-bucketname', prefix='my-prefixname',
                              recursive=True)
for obj in objects:
    print(obj.bucket_name, obj.object_name.encode('utf-8'), obj.last_modified,
          obj.etag, obj.size, obj.content_type)
Hope it helps.
Disclaimer: I work for Minio
Upvotes: 1
Reputation: 8470
Using the higher-level resource API is the way to go.
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('location2')
bucket_files = [x.key for x in bucket.objects.all()]
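The resource collection fetches further pages automatically, so the list comprehension above covers the whole bucket. To restrict the listing to part of the bucket, the collection's filter method accepts a Prefix. A small sketch, where keys_under is a hypothetical helper name and bucket is expected to be a boto3 Bucket resource like the one created above:

```python
def keys_under(bucket, prefix=""):
    """Return the keys in a Bucket resource, optionally limited to a prefix."""
    if prefix:
        objects = bucket.objects.filter(Prefix=prefix)
    else:
        objects = bucket.objects.all()
    return [obj.key for obj in objects]
```

For example, keys_under(bucket, 'some/prefix/') would return only the keys that start with that prefix (the prefix here is a placeholder).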
Upvotes: 2
Reputation: 1692
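There is no need for a --recursive option when using the AWS SDK: the list_objects method already returns objects at every depth in the bucket. Note that a single call returns at most 1,000 objects, so larger buckets need repeated calls or a paginator.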
import boto3
client = boto3.client('s3')
client.list_objects(Bucket='MyBucket')
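The opposite case, listing only one level as the CLI does without --recursive, is what needs an extra argument in the SDK: passing Delimiter='/' groups deeper keys into CommonPrefixes instead of returning them. A minimal sketch, assuming a client created with boto3.client('s3'); list_one_level is a hypothetical helper name.

```python
def list_one_level(client, bucket, prefix=""):
    """Return (sub_prefixes, keys) directly under the prefix, without descending."""
    paginator = client.get_paginator("list_objects_v2")
    sub_prefixes, keys = [], []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # Keys with a further "/" after the prefix are rolled up into CommonPrefixes
        sub_prefixes.extend(p["Prefix"] for p in page.get("CommonPrefixes", []))
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return sub_prefixes, keys
```

Leaving out the Delimiter gives the recursive behaviour described in this answer.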
Upvotes: 3