Reputation: 64834
The hex string '\xd3' can also be represented as: Ó.
The easiest way I've found to get the character representation of the hex string to the console is:
print unichr(ord('\xd3'))
Or in English, convert the hex string to a number, then convert that number to a unicode code point, then finally output that to the screen. This seems like an extra step. Is there an easier way?
Upvotes: 8
Views: 38105
Reputation: 31
Not long ago, I had a very similar problem. I had to decode files that contained unicode hex (e.g., _x0023_) instead of special characters (e.g., #). The solution is described in the following code:
from collections import OrderedDict
import re

def decode_hex_unicode_to_latin1(string: str) -> str:
    # Collect each distinct _xHHHH_ escape (exactly four hex digits), in order of first appearance.
    hex_unicodes = list(OrderedDict.fromkeys(re.findall(r'_x[0-9a-fA-F]{4}_', string)))
    for code in hex_unicodes:
        # Decode the two bytes as Latin-1 and keep the last character (the low byte).
        char = bytes.fromhex(code[2:-1]).decode("latin1")[-1]
        string = string.replace(code, char)
    return string

def main() -> None:
    string = "|_x0020_C_x00f3_digo_x0020_|"
    decoded_string = decode_hex_unicode_to_latin1(string)
    print(string, "-->", decoded_string)

if __name__ == '__main__':
    main()
|_x0020_C_x00f3_digo_x0020_| --> | Código |
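Note that the function above keeps only the low byte of each escape, so it only covers code points up to U+00FF. A minimal sketch of a variant that handles any four-digit escape, assuming Python 3 (the name decode_hex_escapes is made up for illustration):

```python
import re

# Hypothetical helper: replaces every _xHHHH_ escape with the character
# whose code point is the four hex digits HHHH.
_ESCAPE = re.compile(r'_x([0-9a-fA-F]{4})_')

def decode_hex_escapes(text: str) -> str:
    # int(..., 16) parses the hex digits; chr() turns the number into a character.
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

decoded = decode_hex_escapes("|_x0020_C_x00f3_digo_x0020_|")
print(decoded)
```

Anything that does not match the pattern exactly (for example _xzzzz_) is left untouched.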
Upvotes: 1
Reputation: 1985
If data is a bytes object holding UTF-8, something like b"\xe0\xa4\xb9\xe0\xa5\x88\xe0\xa4\xb2\xe0\xa5\x8b \xe0\xa4\x95\xe0\xa4\xb2", then in Python 3
sys.stdout.buffer.write(data)
would print (on a UTF-8 terminal)
हैलो कल
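If you would rather have a str than push raw bytes at the terminal, decoding first should work too. A small sketch, assuming Python 3 and that the bytes really are UTF-8:

```python
# The same bytes as above, written as a bytes literal.
data = b"\xe0\xa4\xb9\xe0\xa5\x88\xe0\xa4\xb2\xe0\xa5\x8b \xe0\xa4\x95\xe0\xa4\xb2"

# decode() turns the UTF-8 bytes into text; print() then encodes that
# text for whatever encoding the terminal uses.
text = data.decode("utf-8")
print(text)
```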
Upvotes: 1
Reputation: 176780
print u'\xd3'
Is all you have to do. You just need to somehow tell Python it's a unicode literal; the leading u does that. It will even work for multiple characters.
If you aren't talking about a literal, but a variable:
codepoints = '\xd3\xd3'
print codepoints.decode("latin-1")
Edit: Specifying a specific encoding when printing will not work if it's incompatible with your terminal encoding, so just let print do encode(sys.stdout.encoding) automatically. Thanks @ThomasK.
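The snippets above are Python 2. For what it's worth, a rough Python 3 equivalent might look like this (a sketch only; the variable names are made up):

```python
# Python 3: a plain string literal is already Unicode text, no u prefix needed.
literal = '\xd3'
print(literal)

# Raw bytes have to be decoded first. Latin-1 maps each byte value
# to the code point with the same number, so 0xD3 becomes U+00D3.
raw = b'\xd3\xd3'
text = raw.decode('latin-1')
print(text)
```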
Upvotes: 13