Reputation: 407
I am trying to learn how to combine all the files from a specific bucket into one CSV file. The files are log-like, always in the same format, and kept in the same bucket. I have this code to access and read them:
bucket = s3_resource.Bucket(bucket_name)
for obj in bucket.objects.all():
    x = obj.get()['Body'].read().decode('utf-8')
    print(x)
It does print them, with separation between the individual files and with each file's column headers. How can I modify my loop to get them all into just one CSV file?
Upvotes: 0
Views: 4609
Reputation: 269101
You should create a file in /tmp/ and write the contents of each object into that file. Then, when all files have been read, upload the file (or do whatever you want to do with it):
output = open('/tmp/outfile.txt', 'w')

bucket = s3_resource.Bucket(bucket_name)
for obj in bucket.objects.all():
    # Append each object's contents to the local file
    output.write(obj.get()['Body'].read().decode('utf-8'))

output.close()
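Once the loop has finished, the combined file can be uploaded back to S3 if that is the goal. A minimal sketch, assuming a destination bucket and key of your choosing (the names below are placeholders):

# Upload the combined file back to S3 (bucket name and key are placeholders)
s3_resource.Bucket('my-output-bucket').upload_file('/tmp/outfile.txt', 'combined/outfile.csv')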
Please note that there is a limit of 512MB in the /tmp/ directory.
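Since each source file repeats the column headers, a variation that writes the header row only once might look like this (a rough sketch, assuming every file starts with the same single header line and ends with a newline):

output = open('/tmp/outfile.csv', 'w')

bucket = s3_resource.Bucket(bucket_name)
for i, obj in enumerate(bucket.objects.all()):
    lines = obj.get()['Body'].read().decode('utf-8').splitlines(keepends=True)
    if i > 0:
        lines = lines[1:]  # drop the repeated header row from every file after the first
    output.writelines(lines)

output.close()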
Upvotes: 2