Reputation: 666
I am trying to read a chat history that contains emoji, but I get the following error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 38: character maps to <undefined>
My code looks like this:
file_name = "chat_file.txt"
chat = open(file_name)
chatText = chat.read() # read data
chat.close()
print(chatText)
I am fairly certain it is caused by characters like: ❤
Which file encoding (Unicode Transformation Format) do I need to specify so that Python can read these characters, and how do I specify it?
Upvotes: 1
Views: 3352
Reputation: 338326
Never open text files without specifying their encoding.
Also, use with blocks; these automatically call .close() so you don't have to.
file_name = "chat_file.txt"
with open(file_name, encoding="utf8") as chat:
    chat_text = chat.read()
print(chat_text)
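To try the snippet above without a real chat log, here is a minimal round-trip sketch. The sample line and the temporary directory are assumptions made up for illustration; they are not part of the original question.

```python
import os
import tempfile

# Hypothetical chat line containing the heart character (U+2764).
sample = "I \u2764 Python"

with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, "chat_file.txt")

    # Write the sample as UTF-8 so the file contents are known.
    with open(file_name, "w", encoding="utf-8") as chat:
        chat.write(sample)

    # Read it back, naming the encoding explicitly.
    with open(file_name, encoding="utf-8") as chat:
        chat_text = chat.read()

    # The with block has already closed the file at this point.
    file_was_closed = chat.closed

print(chat_text == sample, file_was_closed)
```

Naming the encoding on both the write and the read keeps the result independent of the platform's default encoding.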
iso-8859-1 is a legacy single-byte encoding, which means it cannot represent emoji. For emoji, the text file has to use a Unicode encoding, and the most common Unicode encoding is UTF-8.
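As a rough illustration of where the byte 0x9d in the error message probably comes from, the sketch below encodes the heart character as UTF-8 and then decodes those bytes with cp1252. That cp1252 was the default encoding in the question is an assumption (it is common on Western-locale Windows); the traceback only says 'charmap'.

```python
heart = "\u2764"  # the heart character from the question

# UTF-8 stores this character as three bytes; the middle one is 0x9d.
raw = heart.encode("utf-8")
print(raw)

# cp1252 (assumed default encoding) has no character assigned to 0x9d,
# so decoding the same bytes with it fails.
try:
    raw.decode("cp1252")
    error_message = ""
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)

# Decoding with the encoding the file was actually written in works.
decoded = raw.decode("utf-8")
print(decoded == heart)
```

The same mismatch happens inside open() when no encoding is given and the platform default is not UTF-8.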
Upvotes: 8