lf_celine

Reputation: 673

Encoding json in utf-8 and upload to Amazon S3

I have a Python script that exports a JSON file to Amazon S3. It works, but the problem is that I don't know how to use UTF-8 encoding.

My code:

import json
import boto3

s3 = boto3.resource('s3')
...
obj = s3.Object('ccvdb', 'file.json')
obj.put(Body=json.dumps(data),
        ContentType='charset=utf-8')

Output:

{"objectID": 10202, "type": "Coup\u00e9", "cars_getroute": "alfa-romeo-giulia-sprint-gt-veloce-bertone-coupe-1965-1967"}...

How do I use UTF-8 encoding when uploading to S3 with boto3?

EDIT: I found the solution: replace the last line with obj.put(Body=json.dumps(data, indent=4, ensure_ascii=False).encode('utf8'))
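For reference, a minimal sketch of the corrected script (the bucket name ccvdb and key file.json come from the question; the sample data dict and the application/json content type are assumptions):

import json
import boto3

s3 = boto3.resource('s3')

# Hypothetical payload; in the original script, 'data' comes from elsewhere
data = {"objectID": 10202, "type": "Coupé"}

obj = s3.Object('ccvdb', 'file.json')
# ensure_ascii=False keeps "Coupé" literal instead of escaping it to "Coup\u00e9";
# .encode('utf-8') turns the resulting str into the bytes S3 expects
obj.put(Body=json.dumps(data, indent=4, ensure_ascii=False).encode('utf-8'),
        ContentType='application/json; charset=utf-8')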

Upvotes: 2

Views: 6468

Answers (1)

Niobos

Reputation: 910

According to the boto3 documentation, this shouldn't even work, at least not in Python 3. The put() call requires Body= to be of type bytes, while json.dumps() outputs a str. (Python 2 wasn't so strict about bytes vs str.)

UTF-8 is a way to convert a str (a series of characters) to bytes. You can simply call whatever_string.encode('utf-8') to do that conversion for you. In your code, adding .encode('utf-8') after your json.dumps() should do the trick.
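To illustrate the str/bytes distinction (a minimal sketch; the sample dict is made up):

import json

s = json.dumps({"type": "Coupé"}, ensure_ascii=False)
print(type(s))          # <class 'str'> - json.dumps returns text

b = s.encode('utf-8')   # convert the str to bytes using UTF-8
print(type(b))          # <class 'bytes'> - what Body= expects
print(b)                # b'{"type": "Coup\xc3\xa9"}'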

Upvotes: 1
