D Tee

Reputation: 13

Unicode, Bytes, and String to Integer Conversion

I am writing a program that deals with letters from a foreign alphabet. It takes as input the Unicode code point number of a character. For example, 062A is the number assigned in Unicode to one such character.

I first ask the user to input the number that corresponds to a specific letter, e.g. 062A. I am then trying to turn that number into a 16-bit integer that Python can decode, so that I can print the character back to the user.

example:

for \u0394

print(bytes([0x94, 0x03]).decode('utf-16'))

however when I am using

int('062A', '16')

I receive this error:

ValueError: invalid literal for int() with base 10: '062A'

I know it is because I am using A in the string; however, that is the Unicode number for the symbol. Can anyone help me?

Upvotes: 0

Views: 192

Answers (1)

Karl Knechtel

Reputation: 61617

however when I am using int('062A', '16'), I receive this error: ValueError: invalid literal for int() with base 10: '062A'

No, you aren't:

>>> int('062A', '16')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object cannot be interpreted as an integer

It's exactly as it says. The problem is not the '062A', but the '16'. The base should be specified directly as an integer, not a string:

>>> int('062A', 16)
1578
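For comparison, here is a small sketch of the three calls side by side. The helper name try_int is made up for illustration; it returns either the parsed value or the name of the exception raised. Note that the ValueError quoted in the question is what you get when the base is left out entirely, so the default base 10 is used:

```python
def try_int(*args):
    """Hypothetical helper: return int(*args), or the exception's class name."""
    try:
        return int(*args)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


print(try_int('062A', 16))    # base given as an integer: parses as hexadecimal
print(try_int('062A', '16'))  # base given as a string: TypeError
print(try_int('062A'))        # no base, so base 10 is assumed: ValueError
```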

If you want the character at that Unicode code point, then converting through bytes and UTF-16 is too much work. Just ask directly using chr, for example:

>>> chr(int('0394', 16))
'Δ'
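Putting it together for user input, a minimal sketch might look like the following. The function name char_from_hex and the handling of an optional "U+" prefix are my own assumptions, not something from the question:

```python
def char_from_hex(code: str) -> str:
    """Return the character whose Unicode code point is given as a hex string."""
    cleaned = code.strip()
    # Assumption: users may type the conventional "U+" prefix, e.g. "U+062A".
    if cleaned.upper().startswith("U+"):
        cleaned = cleaned[2:]
    # int() raises ValueError for non-hex text; chr() raises ValueError
    # if the number is outside the valid code point range.
    return chr(int(cleaned, 16))


if __name__ == "__main__":
    for text in ("062A", "U+0394"):
        print(text, "->", char_from_hex(text))
```

In a real program you would probably wrap the call in try/except ValueError and ask the user again on bad input.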

Upvotes: 1
