Reputation: 673
I created a .xls file in my python program that I try to push to S3 with boto.
My code:
just_models = [{"key": "1"}, {"key2": 2}]
df_just_models = pd.DataFrame(just_models)
# Notice desired file extension.
just_models = df_just_models.to_excel('../file.xls')
def save_just_models():
# Save to S3
date_file = datetime.now().strftime("%Y-%m-%d")
s3 = boto3.resource('s3')
obj = s3.Object('directory', 'indice_na/just_models' + date_file + '_indice_na.xls')
obj.put(Body=just_models)
save_just_models()
My error:
Traceback (most recent call last):
File "indicena.py", line 11985, in <module>
save_just_models()
File "indicena.py", line 11984, in save_just_models
obj.put(Body=just_models)
File "/home/bolgi/.local/lib/python3.8/site-packages/boto3/resources/factory.py", line 520, in do_action
response = action(self, *args, **kwargs)
File "/home/bolgi/.local/lib/python3.8/site-packages/boto3/resources/action.py", line 83, in __call__
response = getattr(parent.meta.client, operation_name)(*args, **params)
File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/client.py", line 316, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/client.py", line 598, in _make_api_call
request_dict = self._convert_to_request_dict(
File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/client.py", line 646, in _convert_to_request_dict
request_dict = self._serializer.serialize_to_request(
File "/home/bolgi/.local/lib/python3.8/site-packages/botocore/validate.py", line 297, in serialize_to_request
raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Body, value: None, type: <class 'NoneType'>, valid types: <class 'bytes'>, <class 'bytearray'>, file-like object
The error came from the obj.put()
but I don't know exactly how to resolve it.
Upvotes: 1
Views: 1730
Reputation: 238209
Yes, you can write directly to S3 from memory. No need to save the xls file on your local hard drive. This is possible as you can write to BytesIO
instead of the file, and the send the BytesIO
to S3 as shown below. Also default pandas xlsx writer is deprecated, so you may consider using newer one such as xlsxwriter
:
import io
import pandas as pd
just_models = [{"key": "1"}, {"key2": 2}]
df_just_models = pd.DataFrame(just_models)
mem_file = io.BytesIO();
df_just_models.to_excel(mem_file, engine='xlsxwriter')
def save_just_models():
# Save to S3
date_file = datetime.now().strftime("%Y-%m-%d")
s3 = boto3.resource('s3')
obj = s3.Object('directory', 'indice_na/just_models' + date_file + '_indice_na.xls')
obj.put(Body=mem_file.getvalue())
save_just_models()
Upvotes: 2