Reputation: 3667
I have a bucket with various files. I am only interested in pulling files whose names begin with the word 'member' and storing each member file in a list to be concatenated into a dataframe later.
Currently I am pulling data like this:
import io
import boto3
import pandas as pd

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('my-bucket')
obj = s3.Object('my-bucket', 'member')
file_content = obj.get()['Body'].read().decode('utf-8')
df = pd.read_csv(io.StringIO(file_content))
However, this only pulls the single 'member' file. My member files have names like 'member_1229013', 'member_2321903', etc.
How can I read in all the 'member' files and save the data in a list so I can concat later? All column names are the same in all the CSVs.
Upvotes: 1
Views: 234
Reputation: 269101
You can only download/access one object per API call.
I normally recommend downloading the objects to a local directory, and then accessing them as normal local files. Here is an example of how to download an object from Amazon S3:
import boto3
s3 = boto3.client('s3')
s3.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
See: the download_file() documentation
If you want to read multiple files, you will first need to obtain a listing of the files (eg with list_objects_v2()), and then access each object individually.
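A minimal sketch of that approach, assuming the bucket and the 'member' prefix from the question; the helper names (member_keys, read_member_csvs) are mine, not part of any API:

```python
import io
import pandas as pd

def member_keys(s3_client, bucket, prefix="member"):
    """Return all object keys in the bucket whose names start with the prefix."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no matching objects
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys

def read_member_csvs(s3_client, bucket):
    """Read each 'member' CSV into a dataframe, then concat them into one."""
    frames = []
    for key in member_keys(s3_client, bucket):
        body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
        frames.append(pd.read_csv(io.StringIO(body)))
    return pd.concat(frames, ignore_index=True)

# Usage (assumes credentials are configured):
# df = read_member_csvs(boto3.client("s3"), "my-bucket")
```

Passing the client in as a parameter keeps the helpers easy to test; pd.read_csv needs a file-like object, hence the io.StringIO wrapper around the decoded body.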
One tip for boto3... There are two ways to make calls: via a Resource (eg using s3.Object() or s3.Bucket()) or via a Client, which passes everything as parameters.
Upvotes: 1