Python 2.7: Names of unicode representations

Question

What are the names for these different kinds of ascii representations of unicode?

\xF0\x9F\x98\xA2
\U0001f622

And is there a term for the set that they belong to that's more specific than "representation"? And in the context of these, how would I describe the non-ascii representation (😢)?

Since I don't know what to call them it is very hard to search for how to work with them.

Thanks!

Mantas Kandratavičius · Accepted Answer

As Tom Blodget already warned you, this is a somewhat python specific answer.

The leading \ shows that it's an escape sequence.

\x means that the next two characters will be interpreted as a hex digit.

\U means that the next eight characters will be interpreted as a 32-bit hex value.

You can read more about that here:

https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals

To fully answer your question:

\xF0\x9F\x98\xA2 are simply four ASCII characters and you have their hex values
\U0001f622 is a UNICODE codepoint encoded with a 32-bit hex value
😢 is a glyph or simply a special character.

Python 2.7: Names of unicode representations

Answers (2)

Related Questions