sk1pro99

Reputation: 987

String conversion in Python

I have a small question about string conversion in Python 3.

s = '\x001\x002\x001\x000\x005\x005\x000\x004\x000\x000\x00'

print(s) gives the output:

1 2 1 0 5 5 0 4 0 0

However, when I try to convert the string using the following:

bytes(s, 'utf16').decode('utf16'), I get '\x001\x002\x001\x000\x005\x005\x000\x004\x000\x000\x00'.

What is the way to get the same output as print(s) programmatically?

Upvotes: 1

Views: 336

Answers (2)

Giacomo Catenazzi

Reputation: 9533

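In the first example, you call print(s), which writes the raw characters of the string to the console. The console does not render the \x00 characters visibly (depending on the terminal, they show up as blanks or as nothing at all).

On your last line, you evaluate the expression at the Python prompt, which echoes the value instead of printing it. If you print it, print(bytes(s, 'utf-16').decode('utf-16')), you get what you want.

So the Python prompt shows you the value with context (for example, you also see the ' quotes and the escape sequences), not the actual characters of the string (which you get with print).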

ADDENDUM:

print writes out its argument as a string, calling str() on it first if necessary. The Python prompt instead shows the representation of the value, as returned by repr(). So you can call print(repr(bytes(s, 'utf-16').decode('utf-16'))) to get the same text you see in the interactive session, but as a string. Instead of printing it, you can assign the result (r = repr(bytes(...).decode(...))), so that r[0] is ', r[1] is \, and so on.
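The difference between str() and repr() described above can be sketched as follows. This is only an illustration using the string from the question; the names round_trip, r and first_two are made up for the example.

```python
s = '\x001\x002\x001\x000\x005\x005\x000\x004\x000\x000\x00'

# Encoding to UTF-16 and decoding again gives back an equal string,
# so the round trip by itself changes nothing.
round_trip = bytes(s, 'utf-16').decode('utf-16')
same = round_trip == s

# repr() returns the quoted, escaped form that the interactive prompt echoes.
r = repr(round_trip)
first_two = r[:2]  # the opening quote followed by a backslash

print(r)           # escaped form, with visible \x00 sequences
print(round_trip)  # raw characters, NULs included
```

How the second print looks depends on the terminal, since it decides how NUL characters are displayed.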

Upvotes: 2

Sohaib Anwaar

Reputation: 1547

You just need to decode the bytes and drop the NUL characters, and you will get the answer:

x = b'\x001\x002\x001\x000\x005\x005\x000\x004\x000\x000\x00'
str1 = x.decode('utf-8')
print(" ".join([i for i in str1 if ord(i) != 0]))
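If you start from a str, as in the question, rather than from bytes, a similar result can be reached without decoding anything. This is a minimal sketch, assuming the \x00 characters are just padding and carry no meaning; the names digits and spaced are made up for the example.

```python
s = '\x001\x002\x001\x000\x005\x005\x000\x004\x000\x000\x00'

# Drop every NUL character, keeping only the visible digits.
digits = s.replace('\x00', '')

# Rejoin with single spaces to mimic what the console displayed.
spaced = ' '.join(digits)

print(digits)
print(spaced)
```

Unlike print(s), the resulting strings contain no NUL characters at all, so they compare equal to ordinary literals.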

Second Solution:

x = '1 2 1 0 5 5 0 4 0 0'
str_utf16 = x.encode('utf16')
print("Encoding :", str_utf16)
print("Decoding :", str_utf16.decode('utf16'))

Output:

Encoding : b'\xff\xfe1\x00 \x002\x00 \x001\x00 \x000\x00 \x005\x00 \x005\x00 \x000\x00 \x004\x00 \x000\x00 \x000\x00'
Decoding : 1 2 1 0 5 5 0 4 0 0

Upvotes: 1
