Reputation: 673
I have a Python script that exports a JSON file to Amazon S3. It works, but the problem is that I don't know how to use UTF-8 encoding.
My code:
import json
import boto3

s3 = boto3.resource('s3')
...
obj = s3.Object('ccvdb', 'file.json')
obj.put(Body=json.dumps(data),
        ContentType='charset=utf-8')
Output:
{"objectID": 10202, "type": "Coup\u00e9", "cars_getroute": "alfa-romeo-giulia-sprint-gt-veloce-bertone-coupe-1965-1967"}...
How do I use UTF-8 encoding when writing to S3 with boto3?
EDIT: I found the solution: replace the last line with obj.put(Body=json.dumps(data, indent=4, ensure_ascii=False).encode('utf8'))
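For reference, a minimal runnable sketch of the fixed upload. The data dict and the 'application/json; charset=utf-8' content type are illustrative assumptions; the bucket and key are the ones from the question.

import json
import boto3

# Sample record; stands in for the real data structure
data = {"objectID": 10202, "type": "Coupé"}

s3 = boto3.resource('s3')
obj = s3.Object('ccvdb', 'file.json')
obj.put(
    Body=json.dumps(data, indent=4, ensure_ascii=False).encode('utf8'),
    ContentType='application/json; charset=utf-8',  # assumed content type, not in the original code
)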
Upvotes: 2
Views: 6468
Reputation: 910
According to the boto3 documentation, this shouldn't even work, at least not in Python 3+. The put() call requires Body= to be of type bytes, while json.dumps() outputs a str. (Python 2 wasn't so strict about bytes vs. str.)
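A quick check illustrates the mismatch (hypothetical snippet, not from the question):

import json

payload = json.dumps({"type": "Coupé"})
print(type(payload))                   # <class 'str'>
print(type(payload.encode('utf-8')))   # <class 'bytes'>, which is what put() expects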
UTF-8 is a way to convert a str (a series of characters) to bytes. You can simply call whatever_string.encode('utf-8') to do that conversion for you. In your code, adding .encode('utf-8') after your json.dumps() should do the trick.
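Applied to the call from the question, that would look roughly like this sketch:

# json.dumps() returns a str; .encode('utf-8') turns it into the bytes put() expects.
# ensure_ascii=False (as in the asker's edit) keeps "Coupé" literal instead of "Coup\u00e9".
obj.put(Body=json.dumps(data, ensure_ascii=False).encode('utf-8'))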
Upvotes: 1