Reactive_learner

Reputation: 417

Amazon S3 boto3 how to iterate through objects in a bucket?

In a Flask app, I am trying to iterate through the objects in an S3 bucket and print each key (filename), but my_bucket.objects.all() appears to return only the first object in the bucket rather than all of them. The output is [001.pdf] instead of [001, 002, 003, 004, 005].

from flask import Flask, jsonify, Response, request
from flask_cors import CORS, cross_origin
from config import S3_BUCKET, S3_ACCESS_KEY, S3_SECRET_ACCESS_KEY

import boto3
import csv
import re


s3 = boto3.client(
    's3',
    aws_access_key_id=S3_ACCESS_KEY,
    aws_secret_access_key=S3_SECRET_ACCESS_KEY
)

app = Flask(__name__)
CORS(app, supports_credentials=True)


@app.route('/')
def health():
    return jsonify({"message": "app is working"})


@app.route('/files')
def list_of_files():
    s3_resource = boto3.resource('s3')
    my_bucket = s3_resource.Bucket(S3_BUCKET)
    summaries = my_bucket.objects.all()
    files = []
    for file in summaries:
        # this prints the bucket object
        print("Object: {}".format(summaries))
        files.append(file.key)
        # file.key is supposed to return the names of the list of objects
        # print(file.key)
        return jsonify({"files":"{}".format(file.key)})




if __name__ == "__main__":
    app.run()

Upvotes: 4

Views: 16965

Answers (2)

Michael Behrens

Reputation: 1157

Here is another version based on @franklinsijo's. It expects the S3_BUCKET variable to hold your bucket name and assumes your AWS credentials are available in your environment (for example, configured for your shell). The S3 resource is created outside the function so it can be reused if needed. This version prints only the objects under the images prefix. Change the prefix as needed, and append each key to a list instead of printing it if you need to collect the results.

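import boto3

S3_BUCKET = "my-bucket"
# Created once at module level so it can be reused across calls.
s3_resource = boto3.resource('s3')

def list_of_files():
    my_bucket = s3_resource.Bucket(S3_BUCKET)
    # Only objects whose key starts with "images" are returned.
    objects = my_bucket.objects.filter(Prefix="images")
    for obj in objects:  # "obj" avoids shadowing the built-in "object"
        print(obj.key)

list_of_files()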
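
As a rough sketch of the "append the found key to a list" variant mentioned above, the helper below returns the keys instead of printing them. The names keys_with_prefix, FakeObjects, and FakeBucket are hypothetical and introduced only for illustration; the fakes stand in for a boto3 Bucket resource so the snippet runs without AWS credentials. With a real bucket you would pass s3_resource.Bucket(S3_BUCKET) instead.

```python
from types import SimpleNamespace


def keys_with_prefix(bucket, prefix):
    """Return the keys of all objects in `bucket` whose key starts with `prefix`.

    `bucket` is assumed to behave like a boto3 Bucket resource, i.e. to
    expose `objects.filter(Prefix=...)`.
    """
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]


class FakeObjects:
    """Hypothetical stand-in for `Bucket.objects` (illustration only)."""

    def __init__(self, keys):
        self._keys = keys

    def filter(self, Prefix=""):
        # Mimic S3 prefix filtering: keep keys that start with the prefix.
        return [SimpleNamespace(key=k) for k in self._keys if k.startswith(Prefix)]


class FakeBucket:
    """Hypothetical stand-in for a boto3 Bucket resource (illustration only)."""

    def __init__(self, keys):
        self.objects = FakeObjects(keys)


demo_bucket = FakeBucket(["images/a.png", "images/b.png", "docs/001.pdf"])
image_keys = keys_with_prefix(demo_bucket, "images")
print(image_keys)
```

Note that an S3 prefix is a plain string match on the key, so "images" would also match a key such as "images_old/x.png". Use "images/" if you only want keys under that folder.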

Upvotes: 0

franklinsijo

Reputation: 18270

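You are exiting the loop too early: the return statement is inside the for loop, so the function returns on the first iteration. Move the return after the loop so that every key is collected first.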

def list_of_files():
    s3_resource = boto3.resource('s3')
    my_bucket = s3_resource.Bucket(S3_BUCKET)
    summaries = my_bucket.objects.all()
    files = []
    for file in summaries:
        files.append(file.key)
    return jsonify({"files": files})
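
To see why the placement of return matters, here is a small sketch that runs without AWS or Flask. The function names first_key_only and all_keys, and the SimpleNamespace stand-ins for object summaries, are hypothetical and not part of boto3.

```python
from types import SimpleNamespace

# Stand-ins for the object summaries that my_bucket.objects.all() would yield.
summaries = [SimpleNamespace(key=name) for name in ("001.pdf", "002.pdf", "003.pdf")]


def first_key_only(summaries):
    files = []
    for file in summaries:
        files.append(file.key)
        return files  # inside the loop: exits on the first iteration


def all_keys(summaries):
    files = []
    for file in summaries:
        files.append(file.key)
    return files  # after the loop: runs once every object has been visited


print(first_key_only(summaries))
print(all_keys(summaries))
```

The only difference between the two functions is the indentation of the return line, which is the same change made in the corrected list_of_files above.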

Upvotes: 6
