user10734089


Python 3 print utf-8 encoded string problem

I'm requesting a string from a network service. When I print it from within a program:

variable = getFromNetwork()
print(variable)

and run it with python3 net.py, I get:

\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612

When I execute in the python3 CLI:

>>> print("\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612")
تÙ
Ù
Ù612

But when I execute it in the python2 CLI, I get the correct result:

>>> print("\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612")
تملي612

How can I print this correctly from my program in python3?

Edit

After executing the following line:

print(type(variable), repr(variable))

I got:

<class 'str'> '\\xd8\\xaa\\xd9\\x85\\xd9\\x84\\xd9\\x8a612'

I think I should first remove the \\x to turn it into hex and then decode it. What are your solutions?

Upvotes: 4

Views: 947

Answers (3)

safiqul islam

Reputation: 650

In Python 3, I tested the following code:

    line='\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612'
    line = line.encode('raw_unicode_escape')
    line=line.decode("utf-8")
    print(line)

It prints:

تملي612
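For context, raw_unicode_escape behaves like Latin-1 for code points below U+0100, which is why it works on this string. The two codecs differ above that range: Latin-1 raises an error, while raw_unicode_escape emits a \uXXXX escape. A small sketch of the difference (the variable names and the EURO SIGN sample are only illustrative, not taken from the question):

```python
# Compare the two codecs on code points below and above U+00FF.
low = "\xd8\xaa"   # two characters, both below U+0100
high = "\u20ac"    # EURO SIGN, above U+00FF (illustrative sample)

low_raw = low.encode("raw_unicode_escape")
low_latin1 = low.encode("latin-1")

# raw_unicode_escape falls back to a literal backslash-u escape here.
high_raw = high.encode("raw_unicode_escape")

# Latin-1 cannot represent the character at all.
try:
    high.encode("latin-1")
    latin1_failed = False
except UnicodeEncodeError:
    latin1_failed = True

print(low_raw == low_latin1)  # same bytes for the low range
print(high_raw)
print(latin1_failed)
```

So if the input might contain characters above U+00FF, Latin-1 fails loudly, whereas raw_unicode_escape silently produces escape text that will not decode back to the original.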

Upvotes: 0

Serge Ballesta

Reputation: 148880

Your variable is a (Unicode) string whose characters carry the byte values of a UTF-8 encoded byte string. This can happen when the bytes were decoded with the wrong encoding (probably Latin-1 here).

You can fix it by first converting to a byte string without changing the codes (so with a Latin1 encoding) and then you will be able to correctly decode it:

variable = getFromNetwork().encode('Latin1').decode()
print(variable)

Demo:

variable = "\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612"
print(variable.encode('Latin1').decode())

تملي612
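To make the mechanism concrete, here is a minimal sketch of the full round trip, assuming the service really sends UTF-8 bytes that get decoded as Latin-1 somewhere along the way. The variable names are illustrative; the actual decoding step inside getFromNetwork() is not shown in the question, so this only simulates it:

```python
# Simulate the suspected bug: UTF-8 bytes decoded with the wrong codec.
original = "تملي612"

# What a UTF-8 speaking service would put on the wire.
wire_bytes = original.encode("utf-8")

# Wrong decode: every byte becomes one character U+0000..U+00FF.
mojibake = wire_bytes.decode("latin-1")

# Repair: Latin-1 maps each character back to the same byte value,
# and the recovered bytes can then be decoded as UTF-8.
repaired = mojibake.encode("latin-1").decode("utf-8")

print(wire_bytes)            # the raw bytes, as an ASCII-safe repr
print(len(mojibake))         # one character per byte
print(repaired == original)
```

The repair works because Latin-1 is a one-to-one mapping between the 256 byte values and the first 256 code points, so no information is lost by the wrong decode.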

Upvotes: 2

Maurice Meyer

Reputation: 18106

You need to specify the encoding, so the interpreter knows how to interpret the data:

s = "\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612"
y = s.encode('raw_unicode_escape')
print(y)  # is a bytes object now!
print(y.decode('utf-8'))

Out:

b'\xd8\xaa\xd9\x85\xd9\x84\xd9\x8a612'
تملي612
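One caveat: the repr() in the question's edit shows doubled backslashes, which suggests the variable holds literal backslash characters followed by "xd8" and so on, rather than the characters U+00D8 etc. If that is the case, the snippet above would leave the text unchanged, and the escapes need to be interpreted first. A possible sketch, assuming the string contains only ASCII text plus \xNN escapes (unescape_utf8 is a hypothetical helper name, not part of any library):

```python
def unescape_utf8(text):
    """Turn literal \\xNN escapes into bytes, then decode them as UTF-8.

    Assumes `text` contains only Latin-1 characters and valid escapes.
    """
    # 1. str -> bytes, keeping the backslashes as they are.
    # 2. unicode_escape interprets \xNN and yields code points U+00NN.
    # 3. Latin-1 maps those code points back to the same byte values.
    raw = text.encode("latin-1").decode("unicode_escape").encode("latin-1")
    # 4. The recovered bytes are the real UTF-8 payload.
    return raw.decode("utf-8")


# Same content as the repr() shown in the question's edit.
escaped = "\\xd8\\xaa\\xd9\\x85\\xd9\\x84\\xd9\\x8a612"
result = unescape_utf8(escaped)

print(len(escaped), len(result))
print(result == "تملي612")
```

If the service can be changed, it is probably cleaner to fix the place where the bytes are turned into escaped text than to undo it on the client.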

Upvotes: 3
