Reputation: 666
How can I upload a data frame as a zipped CSV to an S3 bucket without saving it on my local machine first?
I have the connection to that bucket already running using:
self.s3_output = S3(bucket_name='test-bucket', bucket_subfolder='')
Upvotes: 2
Views: 1533
Reputation: 1
This streams the CSV through an in-memory gzip buffer; the same pattern works for zip archives if you swap in the zipfile module:
import boto3
import gzip
import pandas as pd
from io import BytesIO, TextIOWrapper
s3_client = boto3.client(
    service_name="s3",
    endpoint_url=your_endpoint_url,
    aws_access_key_id=your_access_key,
    aws_secret_access_key=your_secret_key,
)
# Filename recorded inside the gzip header
your_filename = "test.csv"
s3_path = "path/to/your/s3/compressed/file/test.csv.gz"
bucket = "your_bucket"
df = your_df
gz_buffer = BytesIO()
# Write the CSV through a text wrapper into the in-memory gzip stream
with gzip.GzipFile(filename=your_filename, mode='w', fileobj=gz_buffer) as gz_file:
    df.to_csv(TextIOWrapper(gz_file, 'utf8'), index=False)
s3_client.put_object(
    Bucket=bucket, Key=s3_path, Body=gz_buffer.getvalue()
)
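To sanity-check the upload, pandas can read the compressed object straight back. This is a minimal sketch, assuming the same bucket and s3_path as above and that the optional s3fs package is installed (which pandas needs for s3:// URLs; a custom endpoint_url would additionally require storage_options):
import pandas as pd
# pandas infers gzip compression from the .gz suffix;
# s3:// URLs require the s3fs package and valid credentials
df_check = pd.read_csv(f"s3://{bucket}/{s3_path}")
print(df_check.head())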
Upvotes: 0
Reputation: 4265
We can build an in-memory zip archive with BytesIO and zipfile from the standard library.
# Python 3.7
from io import BytesIO
import zipfile
# .to_csv returns a string when called with no args
s = df.to_csv()
# Keep a reference to the buffer so it can be uploaded after the archive is closed
buffer = BytesIO()
with zipfile.ZipFile(buffer, mode="w") as z:
    z.writestr("df.csv", s)
# Rewind the buffer before handing it to the uploader
buffer.seek(0)
You'll want to refer to upload_fileobj in order to customize how the upload behaves.
yourclass.s3_output.upload_fileobj(buffer, ...)
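If you're calling boto3 directly instead of the question's S3 wrapper, the rewound buffer can go straight to the client's upload_fileobj. A sketch, with the bucket and key as placeholders:
import boto3
s3 = boto3.client("s3")
# Streams the in-memory zip to S3; handles multipart upload for large objects
s3.upload_fileobj(buffer, "your_bucket", "path/to/df.zip")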
Upvotes: 1