Reputation: 283
How can I decode this string, which contains UTF-16 data, using Python 3?
"b'\\xff\\xfeS\\x00H\\x00A\\x00D\\x00E\\x00K\\x00 \\x00D\\x00E\\x00E\\x00E\\x00P\\x00'"
I tried this, but I get the error TypeError: a bytes-like object is required, not 'str':
a.rstrip("\n").decode("utf-16")
Upvotes: 1
Views: 941
Reputation: 55669
You have a string which has been created by calling str on encoded text (a bytes instance), like this:
>>> s = 'abc'
>>> bs = s.encode('utf-16')
>>> bs
b'\xff\xfea\x00b\x00c\x00'
>>> str(bs)
"b'\\xff\\xfea\\x00b\\x00c\\x00'" # <- the 'b' is *inside* the outer quotes
The bytes can be recovered by calling ast.literal_eval on the string, and then the bytes can be decoded back to a string by calling their decode method.
>>> import ast
>>> s = "b'\\xff\\xfeS\\x00H\\x00A\\x00D\\x00E\\x00K\\x00 \\x00D\\x00E\\x00E\\x00E\\x00P\\x00'"
>>> bs = ast.literal_eval(s)
>>> bs
b'\xff\xfeS\x00H\x00A\x00D\x00E\x00K\x00 \x00D\x00E\x00E\x00E\x00P\x00'
>>> original = bs.decode('utf-16')
>>> original
'SHADEK DEEEP'
This is a workaround. The correct solution is to prevent str being called on the bytes instance in the first place.
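A minimal sketch of both routes, assuming the stringified value really is the repr of a bytes object. The helper name recover_text is hypothetical and is not part of any library; the second half shows the preferred fix, which is to decode the bytes directly and never call str on them.

```python
import ast


def recover_text(stringified: str, encoding: str = "utf-16") -> str:
    """Hypothetical helper: undo str(bytes) and decode the result."""
    value = ast.literal_eval(stringified)
    # literal_eval accepts any Python literal, so check that we got bytes back.
    if not isinstance(value, bytes):
        raise TypeError(f"expected a bytes literal, got {type(value).__name__}")
    return value.decode(encoding)


# The workaround: repair a value that was already passed through str().
broken = str("SHADEK DEEEP".encode("utf-16"))
print(recover_text(broken))

# The fix at the source: decode the bytes directly instead of calling str().
raw = "SHADEK DEEEP".encode("utf-16")
print(raw.decode("utf-16"))
```

Both print calls should show the original text. The isinstance check is there because ast.literal_eval will happily return a str, int, or list if that is what the input describes.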
Upvotes: 1
Reputation: 2806
If you can edit this text, change it into this:
r = b'\xff\xfeS\x00H\x00A\x00D\x00E\x00K\x00 \x00D\x00E\x00E\x00E\x00P\x00'
print(r.decode('utf-16')) # SHADEK DEEEP
Notice the difference between these three forms:
"b'\\xff\\xfeS\\x00H\\x00A\\x00D\\x00E\\x00K\\x00 \\x00D\\x00E\\x00E\\x00E\\x00P\\x00'"
b'\\xff\\xfeS\\x00H\\x00A\\x00D\\x00E\\x00K\\x00 \\x00D\\x00E\\x00E\\x00E\\x00P\\x00'
b'\xff\xfeS\x00H\x00A\x00D\x00E\x00K\x00 \x00D\x00E\x00E\x00E\x00P\x00'
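To make the difference concrete, here is a small sketch that compares the type and length of each form. It uses a shortened value (only the BOM and the letter S) purely for illustration; the variable names are assumptions, not anything from the question.

```python
as_str = "b'\\xff\\xfeS\\x00'"   # a str whose text looks like a bytes literal
escaped = b'\\xff\\xfeS\\x00'    # bytes holding literal backslash characters
real = b'\xff\xfeS\x00'          # bytes holding the actual UTF-16 data

print(type(as_str).__name__, len(as_str))    # str 16
print(type(escaped).__name__, len(escaped))  # bytes 13
print(type(real).__name__, len(real))        # bytes 4
print(real.decode('utf-16'))                 # S
```

Only the last form contains the two BOM bytes followed by the two bytes that encode S, which is why it is the only one that decodes to the expected text.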
Upvotes: 2
Reputation: 1645
You seem to have some extra " characters at the beginning and end of what you want to decode.
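This runs without raising an error for me, although the result is garbled because the doubled backslashes are decoded as literal characters:
>>> b'\\xff\\xfeS\\x00H\\x00A\\x00D\\x00E\\x00K\\x00 \\x00D\\x00E\\x00E\\x00E\\x00P\\x00'.decode('utf-16')
'硜晦硜敦屓へ䠰硜〰屁へ䐰硜〰居へ䬰硜〰尠へ䐰硜〰居へ䔰硜〰居へ倰硜〰'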
Update:
As Reznik suggested, you should also delete the extra \ characters.
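If the literal cannot be edited by hand, one possible way to remove the extra \ characters in code is sketched below. It assumes the data contains only ASCII characters and \xNN escapes, as in the question; other input may need different handling.

```python
escaped = b'\\xff\\xfeS\\x00H\\x00A\\x00D\\x00E\\x00K\\x00 \\x00D\\x00E\\x00E\\x00E\\x00P\\x00'

# unicode_escape turns each literal "\xNN" sequence into the code point U+00NN;
# latin-1 then maps each of those code points back to the single byte 0xNN.
real = escaped.decode('unicode_escape').encode('latin-1')

print(real.decode('utf-16'))  # SHADEK DEEEP
```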
Upvotes: 0