mcwizard

Reputation: 461

In Python 3 how to print unicode codepoint as u'\U...'

For whatever reason, I thought it would be neat to create a table of emoji I'm interested in. The first column would be the codepoint, the second the emoji, and the third the name. Something along the lines of this web page, but tailored to my use.

Full emoji data

Assuming I figure out how to iterate over the codepoints (there are other questions for that, or I can construct a list of the ones I'm interested in), I will just cycle through the code points, such as

u_str = u'\U0001F001'
u_str = u'\U0001F002'

(generated programmatically of course)

and print (in a loop):

print(u'\U0001F001', u_str, ' ', unicodedata.name(u_str))
print(u'\U0001F002', u_str, ' ', unicodedata.name(u_str))

If unicodedata had some attribute such as unicodedata.hex_representation, I would just use that, but if such an attribute exists, I can't find it in the documentation.

So in searching for an answer I found this question:

how-does-one-print-a-unicode-character-code-in-python

I attempt:

>>> print(u_str.encode('raw_unicode_escape'))
b'\\U0001f600'

What I'm looking for is the form I typed in:

u_str = u'\U0001F600'

Is this possible or is there some other way to achieve the construction of the table?

Upvotes: 5

Views: 3610

Answers (2)

Mark Tolonen

Reputation: 178379

Using Python 3.6+:

>>> for i in range(0x1f001, 0x1f005):
...     print(f'U+{i:04X} \\U{i:08X} {chr(i)}')
...
U+1F001 \U0001F001 🀁
U+1F002 \U0001F002 🀂
U+1F003 \U0001F003 🀃
U+1F004 \U0001F004 🀄
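
To get the third column the question asks for (the name), the same format specifiers can be combined with unicodedata.name. This is only a sketch: the helper name table_row and the "<no name>" placeholder are made up for illustration and appear nowhere in the question.

```python
import unicodedata


def table_row(codepoint: int) -> str:
    """Return one table row: escape form, character, and Unicode name."""
    char = chr(codepoint)
    # unicodedata.name() raises ValueError for code points without a name,
    # so pass a default instead (the placeholder text is arbitrary).
    name = unicodedata.name(char, "<no name>")
    # :08X zero-pads to eight uppercase hex digits, matching the \UXXXXXXXX form.
    return f"\\U{codepoint:08X} {char} {name}"


for cp in range(0x1F001, 0x1F005):
    print(table_row(cp))
```

Passing a default to unicodedata.name keeps the loop from stopping on unassigned or private-use code points if the range is widened later.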

Upvotes: 6

Ignacio Vazquez-Abrams

Reputation: 799490

  1. The original representation is gone forever. The case and formatting are specified by Python itself.

  2. You need to decode your bytes back to text. Try the ascii codec, since ASCII is all raw_unicode_escape will generate for code points like these.
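
The second point can be sketched as follows, assuming the input is an emoji such as the one in the question. The function name escape_text is hypothetical. Note that the result keeps Python's lowercase hex digits, which is what the first point means by the formatting being specified by Python itself.

```python
def escape_text(s: str) -> str:
    """Return the raw_unicode_escape form of s as text rather than bytes."""
    # encode() yields bytes such as b'\\U0001f600' for an emoji.
    escaped_bytes = s.encode("raw_unicode_escape")
    # For code points above U+00FF the result is pure ASCII, so this decode
    # succeeds; characters in U+0080..U+00FF would be emitted as raw Latin-1
    # bytes and make the ascii codec raise UnicodeDecodeError.
    return escaped_bytes.decode("ascii")


escaped = escape_text("\U0001F600")
print(escaped)
```

This prints \U0001f600 without the b'...' wrapper or the doubled backslash. If uppercase digits are required, formatting ord() of the character directly gives more control than post-processing this string.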

Upvotes: 1
