dijxtra

Reputation: 2751

Fixing UnicodeEncodeError: 'charmap' codec can't encode character in Python 3

I have a file input.txt which contains only one line: "obra鑾n". How that kanji got there is a separate problem which I do not wish to address here. Here I'd just like to print the content of that file on the command line using Python 3 on Windows (I do not have this issue on Linux).

I've been googling this issue for an hour now and have lost my mind without figuring out the solution. Here is how far I got:

# -*- coding: utf-8 -*-
f = open("input.txt", encoding='utf8')
s = f.read()

print(type(s))
#print(s) #error
b = s.encode('utf-8')
print(type(b))
print(b)
#print(b.decode("utf-8")) #error
#print(b.decode('unicode_escape')) #error

The output of this code is:

<class 'str'>
<class 'bytes'>
b'obra\xe9\x91\xben\n'

The error on the first two commented lines is identical:

UnicodeEncodeError: 'charmap' codec can't encode character '\u947e' in position 4: character maps to <undefined>

The error on the last commented line is:

UnicodeEncodeError: 'charmap' codec can't encode character in position 5-6: character maps to <undefined>

I have now run out of ideas. How can I print the content of this file on the Windows command line using Python 3?

Thanks.

Upvotes: 1

Views: 1829

Answers (1)

Mark Tolonen

Reputation: 177725

The Windows command line normally doesn't have a font that supports Asian characters unless your system locale is an Asian locale. You can change the system locale under Control Panel > Region and Language > Administrative tab (Windows 7).

Otherwise, you can try win-unicode-console, but you will still need to find a fixed-width console font that supports Asian characters.

Installing console fonts
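
If changing the locale or font is not an option, one workaround is to stop print() from raising by escaping whatever the console encoding cannot represent. The following is only a minimal sketch under that assumption: the helper name to_console_safe is hypothetical, and the string literal stands in for the content of input.txt from the question. It does not make the character display; it only shows it as an escape instead of failing.

```python
import sys


def to_console_safe(text, encoding):
    """Return text with characters that `encoding` cannot represent
    replaced by backslash escapes (for example \\u947e)."""
    # Encode with the target codec, escaping anything unencodable,
    # then decode again so the result is a str that print() can emit.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


if __name__ == "__main__":
    # Stand-in for the line read from input.txt in the question.
    line = "obra\u947en"
    # sys.stdout.encoding can be None when output is redirected.
    console_encoding = sys.stdout.encoding or "utf-8"
    print(to_console_safe(line, console_encoding))
```

On a console whose encoding lacks the character, this prints the escape sequence in place of the character rather than raising UnicodeEncodeError. On Python 3.7 and later, calling sys.stdout.reconfigure(errors="backslashreplace") once at startup should have a similar effect for every subsequent print().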

Upvotes: 2
