Reputation: 1281

Get the character that a Unicode code point corresponds to

For a Computer Science class, we have to write a Python program that converts a character into its Unicode code point (the binary/hex number that identifies the character). Is there a function that does this, the way ord() converts a character to its ASCII code? And is there a function that does the reverse, turning a Unicode code point into a character?

Thanks

Upvotes: 2

Answers (2)

Olivier Melançon

Reputation: 22314

The built-in function ord works for any Unicode character, not just ASCII, in both Python 2 and Python 3. It returns the character's code point as an integer.

Python 3

>>> c='\U0010ffff'
>>> ord(c)
1114111

Python 2

>>> c=u'\U0010ffff'
>>> ord(c)
1114111

Difference between Python 2 and 3

The two versions differ when you go the other way, from a code point to a character.

In Python 3, the function chr takes any Unicode code point (0 through 0x10FFFF, which includes the ASCII range) and returns the corresponding character.

In Python 2, the function chr only handles codes 0 to 255 and returns a byte string, while the function unichr handles Unicode code points and returns a unicode string.

This is because Python 2 has two separate string types: str for byte strings and unicode for Unicode text.
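As a minimal sketch of the round trip described above, assuming Python 3 (the variable names are only illustrative):

```python
# Python 3 only: chr() accepts any code point from 0 to 0x10FFFF.
code_point = ord('我')        # character -> integer code point
character = chr(code_point)   # integer code point -> character

print(code_point)             # 25105, which is 0x6211
print(character == '我')      # True: ord and chr are inverses
```

In Python 2 the second step would need unichr instead of chr, and the literal would need a u prefix.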

Hexadecimal

If you need to get the character code in hexadecimal, you can use hex.

>>> hex(1114111)
'0x10ffff'

Binary

If you need to get the character code in binary, you can use bin.

>>> bin(1114111)
'0b100001111111111111111'
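If the assignment wants the digits without the 0x or 0b prefix, or in the usual U+XXXX notation, the built-in format function can help. This is only a sketch: describe_code_point is a hypothetical helper name, not something from the standard library.

```python
def describe_code_point(ch):
    """Return (hex digits, binary digits, U+ notation) for one character.

    Hypothetical helper for illustration only.
    """
    cp = ord(ch)
    hex_digits = format(cp, 'x')          # hexadecimal, no '0x' prefix
    bin_digits = format(cp, 'b')          # binary, no '0b' prefix
    u_notation = 'U+{:04X}'.format(cp)    # at least four uppercase hex digits
    return hex_digits, bin_digits, u_notation


print(describe_code_point('A'))   # ('41', '1000001', 'U+0041')
```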

Upvotes: 1

jdhao

Reputation: 28359

In Python 3, if you know the Unicode code point of a character, for example 我 with code point U+6211, you can get the character with:

chr(0x6211)
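If the code point arrives as text rather than as a number (say, typed in by a user), it has to be parsed first. A rough sketch for Python 3, where char_from_notation is a made-up helper name and the accepted input formats are an assumption:

```python
def char_from_notation(text):
    """Turn 'U+6211', '0x6211' or '6211' into the matching character.

    Hypothetical helper; assumes the text holds hexadecimal digits.
    """
    digits = text.strip()
    if digits.upper().startswith('U+'):
        digits = digits[2:]          # drop the 'U+' prefix
    # int(..., 16) parses hex digits and tolerates an optional '0x' prefix
    return chr(int(digits, 16))


print(char_from_notation('U+6211') == '我')   # True
```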