WhatIsMyName

Reputation: 19

Where is "\u202f" in the printed string?

I have a file test.tsv that contains the special character "\u202f".

When I wrote a Python script to read this file, I found that .readlines() reads this character, but .read() apparently does not.

And when I print lines1[0], the "\u202f" disappears.

Why?

Code:

ff = "test.tsv"

lines1 = open(ff, encoding='utf-8').readlines()
str1 = open(ff, encoding='utf-8').read()

print("lines1:", lines1)
print("lines1[0]:", lines1[0])
print("str1:", str1)

Output:

lines1: ['assume Fourbooks\u202f è una piattaforma\n']
lines1[0]: assume Fourbooks  è una piattaforma

str1: assume Fourbooks  è una piattaforma

Upvotes: 1

Views: 3177

Answers (1)

ClassHacker

Reputation: 394

First of all, both readlines() and read() are reading your special character.

readlines() reads each line as it appears in the file and appends it to a list, whereas read() reads the entire content of the file and returns it as a single string.
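To check this directly, here is a minimal sketch that writes a sample line to a temporary file and reads it back both ways. The sample text mirrors the line from the question, and the temporary file stands in for test.tsv; both are assumptions made only so the snippet is self-contained.

```python
import os
import tempfile

# Assumed sample data: one line containing U+202F, as in the question.
sample = "assume Fourbooks\u202f è una piattaforma\n"

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the asker's test.tsv.
    path = os.path.join(tmp, "test.tsv")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    with open(path, encoding="utf-8") as f:
        lines1 = f.readlines()  # list of str, one element per line
    with open(path, encoding="utf-8") as f:
        str1 = f.read()         # one str holding the whole file

# Both results contain the character; only the container type differs.
print(type(lines1).__name__, type(str1).__name__)
print("\u202f" in lines1[0], "\u202f" in str1)
```

Both membership checks report that the character is present, so nothing is lost by either read method.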

If you look closely at your output, you will notice that when printing lines1 you get \u202f as literal escaped text rather than as the character itself. When you print lines1[0] and str1, the special character is still printed, but this time as the actual character, which renders as whitespace.

The actual reason for the difference is which function produces the text. print(lines1) prints a list, and a list builds its text from the __repr__ of each element, which escapes the character. In print(lines1[0]) and print(str1), the str object itself is printed, so __str__ is used and the character is written as is, as mentioned in the comments by MZ.
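The difference can be reproduced without any file at all. The following is a small sketch; the string s is an assumed shortened version of the line from the question.

```python
import unicodedata

# Assumed sample: a shortened version of the line from the question.
s = "Fourbooks\u202f è"

as_repr = repr(s)   # what a list uses for each of its elements
as_list = str([s])  # the text that print([s]) writes
as_str = str(s)     # the text that print(s) writes

print(as_repr)      # the escape sequence \u202f is visible
print(as_list)      # same escape, wrapped in brackets
print(as_str)       # the raw character, rendered as a narrow space

# The character is a space-like code point that repr() chooses to escape.
print(unicodedata.name("\u202f"))
print("\u202f".isprintable())
```

repr() escapes characters that str.isprintable() reports as non-printable, which includes U+202F, while str() leaves the string untouched. That is why the character looks like it "disappears" only when the string is printed on its own.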

Upvotes: 2
