zsh_18

Reputation: 1202

How to get an Excel file from an S3 bucket and upload it again to the S3 bucket without using pandas - Python

I want to get an Excel file with the s3.get_object function and upload it back to a temp location in the same S3 bucket with the s3.put_object function. I do not want to use the pandas library or create a pandas DataFrame anywhere in this process.

The code I used so far is:

s3 = boto3.client('s3')
bucket = 'mybucket'
key_obj = 'name of the file.xlsx'


file_obj = s3.get_object(Bucket = bucket, Key = key_obj)
file_con = file_obj.read().decode('utf-8') 
file_data = io.StringIO(file_con)


s3.put_object(file_data, Bucket=bucket, 'tmp' + '\filename.xlsx')

This code cannot read the Excel file properly. It fails with this error:


UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 15-16: invalid continuation byte

When I changed the decoding from 'utf-8' to 'ISO-8859-1', the error went away, but the file written to the tmp folder was not in a readable format and would not open.

Please suggest a fix. Thanks.

Upvotes: 2

Views: 3358

Answers (1)

Marcin

Reputation: 238967

Your code is full of mistakes, so it's not clear what you really want to do. But to make it valid Python code that correctly downloads and uploads a file, it should be:

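import io

import boto3

s3 = boto3.client('s3')

bucket = 'mybucket'
key_obj = 'name of the file.xlsx'

# get_object returns a dict; the file contents are in the 'Body' stream
file_obj = s3.get_object(Bucket=bucket, Key=key_obj)
file_con = file_obj['Body'].read()  # raw bytes, no decoding
file_data = io.BytesIO(file_con)

# Body, Bucket, and Key must all be passed as keyword arguments
s3.put_object(Body=file_data, Bucket=bucket, Key='tmp' + '/filename.xlsx')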
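
As for why decoding broke the file: an .xlsx file is a binary zip archive, not text, so its bytes should never pass through a text codec. Here is a minimal sketch of the effect that needs no S3 access. The `sample` bytes and the `roundtrip_as_text` helper are made up for illustration (they are not a real workbook or part of boto3), and the sketch assumes the decoded text ends up re-encoded as UTF-8 on upload, which is presumably what happened to your file.

```python
import io


def roundtrip_as_text(data: bytes, encoding: str) -> bytes:
    """Decode bytes as text, then encode that text as UTF-8.

    Hypothetical helper: it mimics treating binary data as a string
    and writing the string back out.
    """
    return data.decode(encoding).encode("utf-8")


# Illustrative stand-in for binary file content: the zip magic number
# followed by bytes that are not valid UTF-8.
sample = b"PK\x03\x04\xff\xfe\x80"

# 1. Decoding as UTF-8 fails, just like in the question.
try:
    sample.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# 2. ISO-8859-1 accepts every byte, so no error is raised, but writing
#    the text back as UTF-8 changes the bytes.
latin1_result = roundtrip_as_text(sample, "ISO-8859-1")

# 3. Keeping the data as bytes in io.BytesIO leaves it untouched.
bytes_result = io.BytesIO(sample).read()

print(utf8_ok)
print(latin1_result == sample)
print(bytes_result == sample)
```

ISO-8859-1 only hides the problem because it maps every possible byte to some character. The content is still altered once it is treated as text, which is why the uploaded file would not open.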

Upvotes: 4
