Anbarasu

Reputation: 609

Why does printing a file containing unicode escapes not produce the emojis?

The content of the text file is:

u'\u26be\u26be\u26be'

When I run the script...

import codecs
f1 = codecs.open("test1.txt", "r", "utf-8")
text = f1.read()
print text
str1 = u'\u26be\u26be\u26be'
print(str1)

I get the output...

u'\u26be\u26be\u26be'
⚾⚾⚾

Question: Why is it that a string literal with the same content as the file prints the emojis properly, while the text read from the file does not?

Upvotes: 2

Views: 209

Answers (2)

2tim

Reputation: 223

If the input file is required to contain unicode escapes, you will need to decode them like so:

with open("test1.txt", "r") as f1:
    text = f1.read()
    print unicode(text, 'unicode_escape')
    str1 = u'\u26be\u26be\u26be'
    print(str1)

No need to import other libraries.
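
As a rough sketch of what the unicode_escape codec does to this particular input (assuming the file holds exactly the text shown in the question, and using a bytes literal in place of the file so the snippet runs on Python 3 as well), note that only the \uXXXX sequences are decoded; the surrounding u'...' wrapper is left in the result:

```python
# Stand-in for the raw file content (an assumption: exactly the text from the question).
raw = b"u'\\u26be\\u26be\\u26be'"

# unicode_escape turns each \uXXXX sequence into the character it names.
decoded = raw.decode("unicode_escape")

# The u prefix and the quotes are ordinary characters, so they survive decoding.
print(len(decoded))            # 3 baseball characters plus u and two quotes
print(decoded[2:-1] == u"\u26be\u26be\u26be")
```

If the wrapper is not wanted, it has to be stripped separately or the text has to be parsed as a literal instead.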

Upvotes: 2

falsetru

Reputation: 369304

The file content u'\u26be\u26be\u26be' is like the raw string r"u'\u26be\u26be\u26be'". In other words, it is the literal characters u, ', \, u, 2, 6, ... and not three ⚾ characters.

You can convert such a string to the string ⚾⚾⚾ using ast.literal_eval:
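
A minimal sketch of that difference, with the variable names file_text and literal chosen here purely for illustration: the text read from the file is a run of ordinary characters, while the same characters typed into source code are evaluated by the interpreter.

```python
# What the file holds: every backslash, digit and quote is its own character.
file_text = r"u'\u26be\u26be\u26be'"

# What the source-code literal becomes: the escapes are evaluated when the
# script is parsed, leaving three characters.
literal = u'\u26be\u26be\u26be'

print(len(file_text))        # counts u, ', \, u, 2, 6, b, e, ...
print(len(literal))          # counts the three decoded characters
print(file_text == literal)  # the two strings are not equal
```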

import ast
import codecs

with codecs.open("test1.txt", "r", "utf-8") as f1:
    text = ast.literal_eval(f1.read())
print text
...

But why does the file contain such a string (u'\u26be\u26be\u26be') instead of ⚾⚾⚾? You may want to consider redesigning the part that saves the file.
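
One possible shape for such a redesign, as a sketch only: write the characters themselves as UTF-8 instead of their repr(). The temporary path is hypothetical, and io.open is used here because it accepts an encoding argument on both Python 2 and Python 3.

```python
import io
import os
import tempfile

# Hypothetical location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "test1.txt")

text = u'\u26be\u26be\u26be'

# Save the characters themselves, encoded as UTF-8, rather than repr(text).
with io.open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading back with the same encoding returns the original three characters,
# so no literal_eval or escape decoding is needed.
with io.open(path, "r", encoding="utf-8") as f:
    loaded = f.read()

print(loaded == text)
```

With the file written this way, the codecs.open(..., "utf-8") call from the question should print the emojis directly, provided the console can display them.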

Upvotes: 8
