Phil

Reputation: 666

Read .txt with emoji characters in python

I am trying to read a chat history that contains emoji, but I get the following error:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 38: character maps to <undefined>

My code looks like this:

file_name = "chat_file.txt"
chat = open(file_name)
chatText = chat.read() # read data
chat.close()
print(chatText)

I am pretty certain that it's because of elements like: ❤

Which file encoding (Unicode Transformation Format) do I have to specify so that Python can read these characters?

Upvotes: 1

Views: 3352

Answers (1)

Tomalak

Reputation: 338326

Never open text files without specifying their encoding.

Also, use with blocks. They call .close() automatically, so you don't have to.

file_name = "chat_file.txt"

with open(file_name, encoding="utf8") as chat:
    chat_text = chat.read()

print(chat_text)
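
To try this without an existing chat file, here is a minimal round-trip sketch. It assumes a throwaway file in a temporary directory and a made-up sample line; it also looks up the encoding that open() would otherwise fall back to on your machine (outside of Python's UTF-8 mode), which is what makes omitting the argument risky.

```python
import locale
import os
import tempfile

# The encoding open() falls back to when none is given (platform dependent).
default_encoding = locale.getpreferredencoding(False)

# Hypothetical sample data; the real chat file will look different.
sample = "I \u2764 Python\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, "chat_file.txt")

    # Write and read with the same explicit encoding.
    with open(file_name, "w", encoding="utf8") as out:
        out.write(sample)

    with open(file_name, encoding="utf8") as chat:
        chat_text = chat.read()

print("platform default:", default_encoding)
print(chat_text == sample)
```

If the platform default printed here is not a UTF-8 variant, reading the same file without encoding="utf8" may fail or silently produce garbled characters.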

iso-8859-1 is a legacy single-byte encoding, which means it cannot represent emoji. For emoji, the text file has to be saved in a Unicode encoding, and the most common encoding for Unicode is UTF-8.
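
As an illustration of where the 0x9d in the error message can come from, here is a small sketch. It assumes the file was saved as UTF-8 and that the platform default was the Windows code page cp1252 (an assumption; the question does not say which code page was in use).

```python
heart = "\u2764"  # the heart character from the question

# UTF-8 stores this one character as three bytes, one of which is 0x9d.
raw = heart.encode("utf-8")
print(raw)

# cp1252 has no character assigned to byte 0x9d, so decoding fails.
try:
    raw.decode("cp1252")
    error = None
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# iso-8859-1 maps every byte to some character, so it never fails,
# but it yields three wrong characters instead of one heart.
garbled = raw.decode("iso-8859-1")
print(len(garbled))

# Decoding with the encoding the bytes were written in restores the text.
print(raw.decode("utf-8") == heart)
```

So a legacy encoding either raises an error or quietly returns the wrong characters, depending on whether the byte happens to be assigned in that code page.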

Upvotes: 8
