Reputation: 461
For whatever reason, I thought it would be neat to create a table of emoji I'm interested in. The first column would be the code point, the second the emoji, the third the name. Something along the lines of this web page, but tailored to my use.
Assuming I figure out how to iterate over the code points (there are other questions for that, or I can construct a list of the ones I care about), I will just cycle through code points such as
u_str = u'\U0001F001'
u_str = u'\U0001F002'
(generated programmatically of course)
and print (in a loop):
print(u'\U0001F001', u_str, ' ', unicodedata.name(u_str))
print(u'\U0001F002', u_str, ' ', unicodedata.name(u_str))
If unicodedata had some attribute such as unicodedata.hex_representation, I would just use that, but if such an attribute exists, I don't understand the spec well enough to find it.
So in searching for an answer I found this question:
how-does-one-print-a-unicode-character-code-in-python
I tried:
>>> print(u_str.encode('raw_unicode_escape'))
b'\\U0001f600'
What I'm looking for is what I put in:
u_str = u'\U0001F600'
Is this possible or is there some other way to achieve the construction of the table?
Upvotes: 5
Views: 3610
Reputation: 178379
Using Python 3.6+:
>>> for i in range(0x1f001,0x1f005):
...     print(f'U+{i:04X} \\U{i:08X} {chr(i)}')
...
U+1F001 \U0001F001 🀁
U+1F002 \U0001F002 🀂
U+1F003 \U0001F003 🀃
U+1F004 \U0001F004 🀄
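To get the name column the question asks for, the same formatting can probably be combined with unicodedata.name. This is only a sketch: the helper name emoji_row and the '<unnamed>' placeholder are my own choices, not anything from unicodedata. If you start from a one-character string instead of an integer, ord(u_str) gives you the code point to format.

```python
import unicodedata

def emoji_row(cp):
    """Build one table row (hypothetical helper): U+ form, Python escape, character, name."""
    ch = chr(cp)
    # Without a default, unicodedata.name raises ValueError for unnamed code points.
    name = unicodedata.name(ch, '<unnamed>')
    return f'U+{cp:04X}  \\U{cp:08X}  {ch}  {name}'

# Same range as the loop above; swap in any list of code points of interest.
for cp in range(0x1F001, 0x1F005):
    print(emoji_row(cp))
```

Which names come back depends on the Unicode version bundled with your Python, so very recent emoji may show up as '<unnamed>' on older interpreters.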
Upvotes: 6
Reputation: 799490
The original representation is gone forever. The case and formatting are specified by Python itself.
You need to decode your bytes back to text. Try the ascii codec, since for code points above U+00FF that's all raw_unicode_escape will generate.
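A minimal sketch of that decoding step, assuming a single character outside the Basic Multilingual Plane (so the escape takes the 8-digit \U form). The variable names escaped and restored are just for illustration. Upper-casing the hex digits afterwards gets you back to the spelling used in the question.

```python
u_str = '\U0001F600'

# Bytes -> text; the escape sequence for an emoji is pure ASCII.
escaped = u_str.encode('raw_unicode_escape').decode('ascii')
print(escaped)   # \U0001f600

# Python picks lowercase hex, so upper-case only the digits after the "\U" prefix.
restored = escaped[:2] + escaped[2:].upper()
print(restored)  # \U0001F600
```

For characters at or below U+FFFF the codec emits the shorter \uXXXX form (or the raw Latin-1 byte below U+0100), so this slicing would need adjusting for those.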
Upvotes: 1