Reputation: 2346
I am trying to make an in-memory zip file, which contains a bunch of JSON files. I am struggling to upload it to S3 as a file object, receiving a rather strange error. Here is my code:
import boto3
import zipfile
import json
import os
from io import BytesIO

session = boto3.session.Session(
    aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'))
client = session.client('s3')

data = {'test1.json': {'a': 1, 'b': 2},
        'test2.json': {'x': 3, 'y': 4}}

zip_buffer = BytesIO()
zf = zipfile.ZipFile(zip_buffer, 'w')
for filename, d in data.items():
    zf.writestr(filename, json.dumps(d, indent=4))

client.upload_fileobj(zf, os.environ.get('S3_BUCKET'), 'test_zip.zip')
This gives me:
KeyError: 'There is no item named 8388608 in the archive'
How and why is this happening? Of course there is no item named 8388608 in the archive - I never put one there.
EDIT
If I save the file locally instead of in-memory and then re-open it, it works fine. Should I be using tempfile perhaps?
Upvotes: 3
Views: 3703
Reputation: 2346
The issue was a rather weird one, but it has a simple explanation. Firstly, it is the zip_buffer
that needs to be passed to upload_fileobj, not zf
. boto3 uploads a file object by repeatedly calling its read() method with a chunk size (8388608 bytes, i.e. 8 MB, by default), whereas zipfile.ZipFile.read() expects the name of an archive member - hence the KeyError complaining about an item named 8388608. Secondly, you need to close the ZipFile object before uploading; closing is what writes the zip's central directory, and skipping it produces a corrupted zip file that cannot be opened.
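The error can in fact be reproduced without S3 at all. Here is a minimal sketch of what happens inside upload_fileobj when it is handed a ZipFile: its chunked read(8388608) call is interpreted as a member-name lookup.

```python
import zipfile
from io import BytesIO

zf = zipfile.ZipFile(BytesIO(), 'w')
zf.writestr('test1.json', '{}')

# boto3 calls file_obj.read(8388608); on a ZipFile, read() looks up
# an archive member by name, and no member is named 8388608.
try:
    zf.read(8388608)
except KeyError as exc:
    error_message = str(exc)
```

The captured message is exactly the one from the question, which confirms the wrong object was being passed.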
import boto3
import zipfile
import json
import os
from io import BytesIO

session = boto3.session.Session(
    aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
    aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'))
client = session.client('s3')

data = {'test1.json': {'a': 1, 'b': 2},
        'test2.json': {'x': 3, 'y': 4}}

zip_buffer = BytesIO()
zf = zipfile.ZipFile(zip_buffer, 'w')
for filename, d in data.items():
    zf.writestr(filename, json.dumps(d, indent=4))
zf.close()          # important! finalizes the archive (writes the central directory)
zip_buffer.seek(0)  # rewind so upload_fileobj reads from the start

client.upload_fileobj(zip_buffer, os.environ.get('S3_BUCKET'), 'test_zip.zip')
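As a variant sketch, the same buffer can be built inside a with block, which guarantees close() is called even if an exception occurs, and the archive can be verified locally before handing it to boto3 (the data dict here is the same example data as above):

```python
import json
import zipfile
from io import BytesIO

data = {'test1.json': {'a': 1, 'b': 2},
        'test2.json': {'x': 3, 'y': 4}}

zip_buffer = BytesIO()
# Leaving the with block closes the ZipFile, finalizing the archive.
with zipfile.ZipFile(zip_buffer, 'w') as zf:
    for filename, d in data.items():
        zf.writestr(filename, json.dumps(d, indent=4))
zip_buffer.seek(0)

# Sanity check: re-open the buffer and read every member back.
with zipfile.ZipFile(zip_buffer) as zf:
    names = zf.namelist()
    contents = {name: json.loads(zf.read(name)) for name in names}
zip_buffer.seek(0)  # rewind again before passing the buffer to upload_fileobj
```

If the close/seek steps were skipped, this round-trip check would be the first thing to fail, so it is a cheap guard to keep in place.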
Upvotes: 13