Gingmeister

Reputation: 272

Error trying to save a Python list to an S3 bucket.

I have tried my best to research this problem, including on Stack Overflow, and I just don't understand it. I just want to save the output of a Lambda function to an S3 bucket, but it seems like S3 doesn't accept a list as a data type.

I get an error:

botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Body, value:  type: <class 'list'>, valid
types: <class 'bytes'>, <class 'bytearray'>, file-like object

It seems like a list is not a suitable output type for an S3 bucket? Here is the code I am using:

bucket_name = "output-bucket"
file_name = "output.json"
s3 = boto3.resource('s3')
object = s3.Object(bucket_name, file_name)
object.put(Body=output_sentences)

I think I am just not understanding the way this works...

Upvotes: 8

Views: 9520

Answers (3)

AlanCWR

Reputation: 131

If you (or anyone else who stumbles across this) don't want to serialize to JSON and instead want to write a list of strings to S3 as a plain text file with one string per line, much like writing the lines to a local file, you can join the list with newlines:

object.put(Body="\n".join(output_sentences))

or, if you prefer boto3's client API,

s3_client = boto3.client("s3")
s3_client.put_object(
    Body="\n".join(output_sentences),
    Key=file_name,
    Bucket=bucket_name,
)

S3 is happy to take one big string as the body of an object, but not a list of strings.
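To make the newline-joined approach explicit about bytes, here is a minimal sketch of the conversion in both directions. The helper names lines_to_body and body_to_lines are hypothetical, not part of boto3, and the sketch assumes UTF-8 text and that no string in the list contains a newline itself.

```python
def lines_to_body(lines):
    """Join a list of strings with newlines and encode to UTF-8 bytes."""
    return "\n".join(lines).encode("utf-8")


def body_to_lines(body):
    """Decode bytes read back from S3 and split them into a list again."""
    return body.decode("utf-8").split("\n")


output_sentences = ["this", "is", "a", "sentence"]
body = lines_to_body(output_sentences)  # could be passed as Body=body
restored = body_to_lines(body)
```

Note that an empty list would come back as a list holding one empty string, so this round trip only fits non-empty lists of single-line strings.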

Upvotes: 3

Gingmeister

Reputation: 272

OK thanks. I managed to do it like this:

import json
import boto3

s3 = boto3.resource('s3')
object = s3.Object(bucket_name, file_name)
object.put(Body=json.dumps(output_data, indent=2).encode('UTF-8'))  # str -> bytes
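For anyone who also needs to read the JSON back, the following is a rough sketch of the full round trip. The function names are hypothetical, and the two S3 functions assume boto3 is installed and AWS credentials are configured; boto3 is imported inside them so the pure conversion helpers can be tried without AWS access.

```python
import json


def list_to_json_body(items):
    """Serialize a list to UTF-8 encoded JSON bytes, a type Body accepts."""
    return json.dumps(items, indent=2).encode("utf-8")


def json_body_to_list(body):
    """Decode JSON bytes read from S3 back into a Python list."""
    return json.loads(body.decode("utf-8"))


def save_list(bucket_name, file_name, items):
    """Upload the list as a JSON object (requires AWS credentials)."""
    import boto3  # imported here so the helpers above run without boto3
    s3 = boto3.resource("s3")
    s3.Object(bucket_name, file_name).put(Body=list_to_json_body(items))


def load_list(bucket_name, file_name):
    """Download the JSON object and return the list (requires AWS credentials)."""
    import boto3
    s3 = boto3.resource("s3")
    body = s3.Object(bucket_name, file_name).get()["Body"].read()
    return json_body_to_list(body)
```

With these in place, save_list("output-bucket", "output.json", output_data) would replace the three lines above.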

Upvotes: 3

Bruno Lubascher

Reputation: 2121

The error says that Body only accepts bytes, a bytearray, or a file-like object.

So you need to convert your list into bytes first. One way to do that is with pickle.

import pickle

output_sentences = ['this', 'is', 'a', 'sentence']

# Convert the list to bytes
b = pickle.dumps(output_sentences)

# Save the bytes (not the original list) as the object body
object.put(Body=b)

Once you load the bytes from S3 again, you can turn them back into a list with:

b = object.get()['Body'].read()
read_sentence = pickle.loads(b)
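Since the error message also lists a file-like object as a valid Body type, here is a small sketch of that variant using io.BytesIO. The helper names are hypothetical and the round trip is done locally rather than against S3. Keep in mind that pickle.loads can execute arbitrary code, so it should only be used on data you trust.

```python
import io
import pickle


def list_to_pickle_stream(items):
    """Pickle a list and wrap the bytes in a file-like object."""
    return io.BytesIO(pickle.dumps(items))


def pickle_bytes_to_list(data):
    """Rebuild the list from pickled bytes; only use on trusted data."""
    return pickle.loads(data)


stream = list_to_pickle_stream(['this', 'is', 'a', 'sentence'])
# The stream could be passed as Body=stream; here it is read back locally.
restored = pickle_bytes_to_list(stream.read())
```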

Upvotes: 1
