Borealis

Reputation: 8470

How to recursively list files in AWS S3 bucket using AWS SDK for Python?

I am trying to replicate the AWS CLI ls command to recursively list files in an AWS S3 bucket. For example, I would use the following command to recursively list all of the files in the "location2" bucket.

aws s3 ls s3://location2 --recursive

What is the AWS SDK for Python (i.e. boto3) equivalent of aws s3 ls s3://location2 --recursive?

Upvotes: 9

Views: 22091

Answers (5)

Kanagavelu Sugumar

Reputation: 19260

aws s3 ls s3://logs/access/20230104/14/ --recursive

To list the complete path of every file, with handling for the case where nothing matches the prefix:

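import boto3

s3_client = boto3.client('s3')
# The paginator follows continuation tokens, so every page of results is visited
paginator = s3_client.get_paginator('list_objects_v2')
pages = paginator.paginate(Bucket="logs", Prefix="access/20230104/14/")

for page in pages:
    try:
        for obj in page['Contents']:
            print(obj['Key'])
    except KeyError:
        # A page with no matching objects has no 'Contents' key
        print("No files exist")
        exit(1)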
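
The same idea could be wrapped in a small generator, which makes it easier to reuse. This is only a sketch: `iter_keys` and `main` are names I made up, and the client is passed in as an argument so the function can be tried without real credentials.

```python
def iter_keys(client, bucket, prefix=""):
    """Yield every object key under `prefix`, following pagination."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages with no matching objects omit "Contents" entirely.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def main():
    # Not called automatically; needs boto3 installed and AWS credentials.
    import boto3

    for key in iter_keys(boto3.client("s3"), "logs", "access/20230104/14/"):
        print(key)
```

Because it yields keys lazily, an empty prefix simply produces nothing, and the caller can decide whether that counts as an error.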

Upvotes: 3

wikier

Reputation: 2565

You'd need to use paginators:

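import boto3

client = boto3.client("s3")
bucket = "my-bucket"
paginator = client.get_paginator('list_objects')
page_iterator = paginator.paginate(Bucket=bucket)
for page in page_iterator:
    # An empty bucket returns a page without a 'Contents' key
    for obj in page.get('Contents', []):
        # Single quotes inside the f-string keep this valid before Python 3.12
        print(f"s3://{bucket}/{obj['Key']}")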
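
If the goal is output that looks like `aws s3 ls --recursive` (timestamp, size, key), something along these lines might get close. It is a rough sketch: `format_ls_line` and `ls_recursive` are hypothetical names, the column widths are my approximation of the CLI layout, and the timestamp is printed as returned by S3 rather than converted to local time.

```python
def format_ls_line(obj):
    """Format one listing entry roughly like a line of `aws s3 ls` output."""
    modified = obj["LastModified"].strftime("%Y-%m-%d %H:%M:%S")
    # Size is right-aligned in a 10-character column (an approximation).
    return f"{modified} {obj['Size']:>10} {obj['Key']}"


def ls_recursive(client, bucket, prefix=""):
    """Yield one formatted line per object in the bucket."""
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield format_ls_line(obj)
```

With a real client this would be used as `for line in ls_recursive(boto3.client("s3"), "location2"): print(line)`.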

Upvotes: 6

koolhead17

Reputation: 1964

You can also use the minio-py client library; it is open source and compatible with AWS S3.

The list_objects.py example is below. You can refer to the docs for additional information.

from minio import Minio

client = Minio('s3.amazonaws.com',
               access_key='YOUR-ACCESSKEYID',
               secret_key='YOUR-SECRETACCESSKEY')

# List all object paths in bucket that begin with my-prefixname.
objects = client.list_objects('my-bucketname', prefix='my-prefixname',
                              recursive=True)
for obj in objects:
    print(obj.bucket_name, obj.object_name.encode('utf-8'), obj.last_modified,
          obj.etag, obj.size, obj.content_type)

Hope it helps.

Disclaimer: I work for Minio

Upvotes: 1

Borealis

Reputation: 8470

Using the higher-level resource API is the way to go.

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('location2')
bucket_files = [x.key for x in bucket.objects.all()]
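
To restrict the listing to one "folder", the resource collection also has a `filter` method that accepts a `Prefix`. A minimal sketch, assuming a `Bucket` resource is passed in (`list_keys` is a made-up helper name):

```python
def list_keys(bucket, prefix=""):
    """Return all keys in `bucket`, optionally limited to a key prefix."""
    if prefix:
        # Only objects whose key starts with `prefix` are returned.
        objects = bucket.objects.filter(Prefix=prefix)
    else:
        objects = bucket.objects.all()
    # The collection handles pagination while it is iterated.
    return [obj.key for obj in objects]
```

For example, `list_keys(boto3.resource('s3').Bucket('location2'), 'some/prefix/')`, where the prefix is just a placeholder.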

Upvotes: 2

Shimon Tolts

Reputation: 1692

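There is no need for a --recursive option when using the AWS SDK: unless you pass a Delimiter, the list_objects method returns keys at every "folder" depth in the bucket. Note that a single call returns at most 1,000 keys, so larger buckets need several calls or a paginator.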

import boto3

client = boto3.client('s3')
response = client.list_objects(Bucket='MyBucket')
for obj in response.get('Contents', []):
    print(obj['Key'])
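
A single list call returns at most 1,000 keys, so for bigger buckets the call has to be repeated. Here is a sketch of doing that by hand with `list_objects_v2` and its continuation token, instead of a paginator. `list_all_keys` is a hypothetical helper name, and the client is passed in as an argument.

```python
def list_all_keys(client, bucket, prefix=""):
    """Collect every key by repeating list_objects_v2 until no pages remain."""
    keys = []
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        response = client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in response.get("Contents", []))
        # IsTruncated is true when more keys remain beyond this response.
        if not response.get("IsTruncated"):
            return keys
        # Ask for the next batch, starting where this one stopped.
        kwargs["ContinuationToken"] = response["NextContinuationToken"]
```

This collects everything into a list in memory, which is fine for modest buckets; for very large ones a generator is probably the better choice.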

Upvotes: 3
