Reputation: 26326
Python 3.6
I converted a string from UTF-8 to this:
b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89\xe7\x94\[email protected]'
I now want that chunk of ASCII back in string form, so there is no longer the little b for bytes at the beginning.
BUT I don't want it converted back to UTF-8; I want the same sequence of characters that you see above in my Python string.
How can I do so? All I can find are ways of converting bytes to string along with encoding or decoding.
Upvotes: 0
Views: 3636
Reputation:
The (wrong) answer is quite simple:
chr(asciiCode)
In your special case:
myString = ""
for char in b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89\xe7\x94\[email protected]':
    myString += chr(char)
print(myString)
gives:
æ没æçµ@xn--ssdcsrs-2e1xt16k.com.au
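For what it's worth, the loop above appears to be equivalent to decoding the bytes as Latin-1, since Latin-1 maps each byte value 0-255 to the code point with the same number. A minimal sketch, assuming a made-up sample (`data` uses the first three characters of the question's bytes plus a placeholder `example.com` domain, because the real address is not shown):

```python
# Hypothetical sample: first nine bytes from the question plus a placeholder domain.
data = b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89@example.com'

# The chr() loop from the answer, written as a join.
via_chr = ''.join(chr(byte) for byte in data)

# Latin-1 decoding: one byte becomes one character with the same code point.
via_latin1 = data.decode('latin-1')

# Encoding with Latin-1 again recovers the original bytes unchanged.
round_trip = via_latin1.encode('latin-1')

print(via_chr == via_latin1)        # both routes give the same str
print(len(via_chr) == len(data))    # one character per byte
print(round_trip == data)           # nothing was lost
```

So even this "no decoding" route is a decoding step, just one that happens to be lossless for any byte sequence.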
Maybe you are also interested in the right answer? It will probably not please you, because it says you ALWAYS have to deal with encoding/decoding ... because myString is now both UTF-8 and ASCII at the same time (exactly as it already was before you "converted" it to ASCII).
Notice that how myString shows up when you print it will depend on the implicit encoding/decoding used by print.
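To illustrate why the printed result depends on an encoding, here is a small sketch showing that one and the same character becomes different bytes under different encodings. The variable names are made up for the example, and `sys.stdout.encoding` is only reported, since its value depends on your terminal and environment:

```python
import sys

text = '\u00e6'  # the single character "ae ligature", code point U+00E6

as_utf8 = text.encode('utf-8')      # two bytes in UTF-8
as_latin1 = text.encode('latin-1')  # one byte in Latin-1

print(as_utf8)    # b'\xc3\xa6'
print(as_latin1)  # b'\xe6'

# print() encodes str output using the encoding of sys.stdout.
print(sys.stdout.encoding)
```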
In other words ...
there is NO WAY to avoid encoding/decoding,
but there is a way of doing it implicitly rather than explicitly.
I suppose that reading my answer to "Converting UTF-8 (in literal) to Umlaute" will help you a lot in understanding the whole encoding/decoding thing.
Upvotes: 1
Reputation: 31250
What you have there is not ASCII, as it contains, for instance, the byte \xe6, which is higher than 127. It's still UTF-8.
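A quick way to check this yourself, sketched with a made-up sample (`data` is the first nine bytes from the question plus a placeholder `example.com` domain, not the real address). It avoids `bytes.isascii()`, which is not available in Python 3.6:

```python
# Hypothetical sample: first nine bytes from the question plus a placeholder domain.
data = b'\xe6\x88\x91\xe6\xb2\xa1\xe6\x9c\x89@example.com'

# Iterating over bytes yields ints; ASCII only covers 0-127.
non_ascii = [byte for byte in data if byte > 127]
is_ascii = not non_ascii

# The same bytes are still valid UTF-8 and decode to three characters plus the domain.
decoded = data.decode('utf-8')

print(is_ascii)        # False
print(len(non_ascii))  # 9
print(len(decoded))    # 15
```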
The representation of the string (with the 'b' at the start, then a ', then a '\', ...), that is ASCII. You get it with repr(yourstring). But the contents of the string that you're printing are UTF-8.
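If the goal is a str holding exactly the characters you see on screen, a sketch along these lines may do it. The sample `data` is hypothetical (placeholder domain), and slicing the repr assumes the bytes contain no quote or backslash characters, since repr would quote or escape those differently:

```python
# Hypothetical sample with a placeholder domain.
data = b'\xe6\x88\x91@example.com'

# repr() gives the ASCII-only text that the interpreter displays, including b'...'.
shown = repr(data)

# Drop the leading b' and the trailing ' to keep only the escaped contents.
from_repr = shown[2:-1]

# Alternative: decode as ASCII and turn each non-ASCII byte into a \xNN escape.
escaped = data.decode('ascii', errors='backslashreplace')

print(shown)      # b'\xe6\x88\x91@example.com'
print(from_repr)  # \xe6\x88\x91@example.com
print(from_repr == escaped)
```

Both results are ordinary str objects made only of ASCII characters; the backslashes in them are literal characters, not escapes.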
I don't think you need to turn that back into a UTF-8 string, though it may depend on the rest of your code.
Upvotes: 0