Josh Zwiebel

Reputation: 1003

Files uploaded to S3 are missing content

I am trying to upload a series of files to S3 from a pandas DataFrame. The code I am using to do that is shown below:

import os

import boto3
import pandas as pd
from botocore.exceptions import NoCredentialsError

ACCESS_KEY = os.environ['KEY_ID']
SECRET_KEY = os.environ['SECRET_KEY']


def upload_to_aws(local_file, bucket, s3_file):
    s3 = boto3.client('s3', aws_access_key_id=ACCESS_KEY,
                      aws_secret_access_key=SECRET_KEY)

    try:
        s3.upload_file(local_file, bucket, s3_file)
        print("Upload Successful")
        return True
    except NoCredentialsError:
        print("Credentials not available")
        return False
    except FileNotFoundError:
        print("The file was not found")
        return False


df = pd.read_csv('file.csv')
description = df.description.iloc[50]
text_file = open(f"textfile.txt", "w")
text = text_file.write(description)
upload_to_aws("textfile.txt",'bucket-name',"test.txt")
text_file.close()

I am grabbing an element of the data frame that is stored as a string and writing it to a text file. The file is created locally without problems; however, the versions on S3 show up with no content, with a size of 0 B.

What about my code is causing this issue, and how can I make sure the content shows up? If there is a smarter way to approach this, I would love to know.

Upvotes: 1

Views: 716

Answers (1)

antont

Reputation: 2756

You need to close the file first so that the data is written to the file system.

with open("textfile.txt", "w") as text_file:
    text_file.write(description)

# the with block ends here, close() is called on the file, and its contents are written to disk
upload_to_aws("textfile.txt", "bucket-name", "test.txt")

This can also be done with flush() if you want to keep the file open to write more, but you don't need that here.
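For example, a minimal sketch of the flush() variant, reusing the names from the question's code:

text_file = open("textfile.txt", "w")
text_file.write(description)
text_file.flush()  # push Python's internal buffer to the OS so other readers of the file see the data
upload_to_aws("textfile.txt", "bucket-name", "test.txt")
# the file is still open here, so you can keep writing to it
text_file.close()

And if you don't need the local file at all, you could skip it entirely and send the text straight to S3 with the client's put_object method, e.g. s3.put_object(Bucket='bucket-name', Key='test.txt', Body=description.encode('utf-8')).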

Upvotes: 2
