lf_celine
lf_celine

Reputation: 673

Send xls file to S3 in python

I created a .xls file in my python program that I try to push to S3 with boto.

My code:

just_models = [{"key": "1"}, {"key2": 2}]

df_just_models = pd.DataFrame(just_models)
# Notice desired file extension.
just_models = df_just_models.to_excel('../file.xls')

def save_just_models():
    # Save to S3
    date_file = datetime.now().strftime("%Y-%m-%d")
    s3 = boto3.resource('s3')
    obj = s3.Object('directory', 'indice_na/just_models' + date_file + '_indice_na.xls')
    obj.put(Body=just_models)
save_just_models()

My error:

    Traceback (most recent call last):
  File "indicena.py", line 11985, in <module>
    save_just_models()
  File "indicena.py", line 11984, in save_just_models
    obj.put(Body=just_models)
  File "/home/bolgi/.local/lib/python3.8/site-packages/boto3/resources/factory.py", line 520, in do_action
    response = action(self, *args, **kwargs)
  File "/home/bolgi/.local/lib/python3.8/site-packages/boto3/resources/action.py", line 83, in __call__
    response = getattr(parent.meta.client, operation_name)(*args, **params)
  File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/client.py", line 598, in _make_api_call
    request_dict = self._convert_to_request_dict(
  File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/client.py", line 646, in _convert_to_request_dict
    request_dict = self._serializer.serialize_to_request(
  File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/validate.py", line 297, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Body, value: None, type: <class 'NoneType'>, valid types: <class 'bytes'>, <class 'bytearray'>, file-like object

The error came from the obj.put() but I don't know exactly how to resolve it.

Upvotes: 1

Views: 1730

Answers (1)

Marcin
Marcin

Reputation: 238209

Yes, you can write directly to S3 from memory. No need to save the xls file on your local hard drive. This is possible as you can write to BytesIO instead of the file, and the send the BytesIO to S3 as shown below. Also default pandas xlsx writer is deprecated, so you may consider using newer one such as xlsxwriter:

import io

import pandas as pd

just_models = [{"key": "1"}, {"key2": 2}]

df_just_models = pd.DataFrame(just_models)

mem_file = io.BytesIO();

df_just_models.to_excel(mem_file, engine='xlsxwriter')

def save_just_models():
    # Save to S3
    date_file = datetime.now().strftime("%Y-%m-%d")
    s3 = boto3.resource('s3')
    obj = s3.Object('directory', 'indice_na/just_models' + date_file + '_indice_na.xls')
    obj.put(Body=mem_file.getvalue())
    
save_just_models()

Upvotes: 2

Related Questions