Diego Toro

Reputation: 33

Preserve unicode of emojis in python

I'm working with emoji and want to save each image under a file name built from its Unicode code points, like 1F636_200D_1F32B_FE0F for https://emojipedia.org/face-in-clouds/.

But for https://emojipedia.org/keycap-digit-one/ the file names end up as 1_FE0F_20E3, and I need them to be 0031_FE0F_20E3. Is there a way to tell the encoder not to leave the 1 as a literal character?

>>> '1️⃣'.encode('unicode-escape').decode('ascii')
'1\\ufe0f\\u20e3'

Upvotes: 0

Views: 308

Answers (1)

Mark Tolonen

Reputation: 178115

The unicode_escape codec writes printable ASCII characters such as 1 as themselves and uses escape codes only for non-ASCII characters. If you want every character written as an escape code, you have to format them yourself:

>>> ''.join([f'\\u{ord(c):04x}' if ord(c) < 0x10000 else f'\\U{ord(c):08x}' for c in '😶‍🌫️'])
'\\U0001f636\\u200d\\U0001f32b\\ufe0f'
>>> ''.join([f'\\u{ord(c):04x}' if ord(c) < 0x10000 else f'\\U{ord(c):08x}' for c in '1️⃣'])
'\\u0031\\ufe0f\\u20e3'

Or maybe you want this format?

>>> '_'.join([f'{ord(c):04X}' for c in '1️⃣'])
'0031_FE0F_20E3'
>>> '_'.join([f'{ord(c):04X}' for c in '😶‍🌫️'])
'1F636_200D_1F32B_FE0F'

Upvotes: 2
