user2950593

Reputation: 9627

python json load set encoding to utf-8

I have this code:

keys_file = open("keys.json")
keys = keys_file.read().encode('utf-8')
keys_json = json.loads(keys)
print(keys_json)

There are some non-English characters in keys.json. But as a result I get:

[{'category': 'мбт', 'keys': ['Блендер Philips',
'мультиварка Polaris']}, {'category': 'КБТ', 'keys':
['холод ильник атлант', 'посудомоечная
машина Bosch']}]

What do I do?

Upvotes: 56

Views: 212653

Answers (1)

deceze

Reputation: 522016

encode means characters to binary. What you want when reading a file is binary to characters, which is decode. But really this entire process is way too manual; simply do this:

import json

with open('keys.json', encoding='utf-8') as fh:
    data = json.load(fh)

print(data)

with handles the correct opening and closing of the file, the encoding argument to open ensures the file is read using the correct encoding, and the load call reads directly from the file handle, so you don't need to read the contents into a string yourself first.
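To make the encode/decode direction concrete, here is a minimal sketch. The sample string is made up for illustration and is not the actual contents of keys.json.

```python
import json

# Hypothetical sample data, standing in for the text inside keys.json.
text = '[{"category": "КБТ"}]'

raw = text.encode('utf-8')   # str -> bytes (characters to binary)
back = raw.decode('utf-8')   # bytes -> str (binary to characters)

print(type(raw).__name__)    # bytes
print(back == text)          # True

# json.loads works on the decoded str and gives back Python objects.
data = json.loads(back)
```

Opening the file with encoding='utf-8' performs the decode step for you, which is why no explicit encode or decode call is needed in the version above.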

If this still outputs invalid characters, it means your source encoding isn't UTF-8 or your console/terminal doesn't handle UTF-8.
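One way to narrow down which of the two it is can be sketched as follows. The helper name is_valid_utf8 is hypothetical, not part of any library; it only checks whether the raw bytes decode as UTF-8.

```python
import sys

def is_valid_utf8(path):
    """Return True if the file's raw bytes decode cleanly as UTF-8."""
    with open(path, 'rb') as fh:   # binary mode: open() does no decoding
        raw = fh.read()
    try:
        raw.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True

# The encoding Python uses when printing to the console/terminal.
print(sys.stdout.encoding)
```

Calling is_valid_utf8('keys.json') and getting False means the file is definitely not UTF-8, so a different encoding argument is needed. If it returns True but the output is still wrong, the printed sys.stdout.encoding is the next thing to check.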

Upvotes: 143
