Dr. John James Cobra

Reputation: 216

Python: Convert Integer to UTF16-LE

I have the integer value 29827 (0x7483) and I want to convert it into the Unicode Han character 'glass' (U+7483) (see http://www.fileformat.info/info/unicode/char/7483/index.htm) with UTF-16-LE encoding.

I managed to convert this number into a 3-byte UTF-8 encoding (code points from 2048 to 65535 take 3 bytes in UTF-8) with

s = '\u%s' % hex(int_to_encode)[2:]
file.write(s.decode('unicode-escape').encode('utf-8'))
file.close()

But I found out that the encoding I actually need is UTF-16-LE. In the intended encoding, the representation of an integer also seems to take 3 bytes (which is why I thought my first attempt was correct: also 3 bytes for one integer).

Thanks a lot for your help,

Kind regards

Upvotes: 0

Views: 2511

Answers (1)

Duncan

Reputation: 95712

First of all, to convert a number to a character use chr() (Python 3) or unichr() (Python 2). Then, to encode using UTF-16-LE, simply specify that encoding instead of UTF-8.

So in Python 2:

int_to_encode = 0x7483
s = unichr(int_to_encode)
file.write(s.encode('utf-16-le'))
file.close()
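
For Python 3, where unichr() no longer exists, the same conversion might look like the sketch below. The helper name code_point_to_utf16le is made up for illustration and is not part of any library.

```python
def code_point_to_utf16le(code_point):
    """Return the UTF-16-LE bytes for one Unicode code point (Python 3)."""
    # chr() maps the integer to a one-character str; encode() produces bytes.
    return chr(code_point).encode('utf-16-le')


encoded = code_point_to_utf16le(29827)  # 29827 == 0x7483
print(encoded.hex())  # 8374 (low byte first, no BOM)
```

The resulting bytes can be written to a file opened in binary mode ('wb').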

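In either Python 2 or Python 3 you can specify the file encoding when you open the file with io.open(). The example uses Python 2's unichr(); on Python 3 use chr() instead:

import io
s = unichr(0x7483)  # Python 3: s = chr(0x7483)
with io.open('foo', 'w', encoding='utf-16-le') as f:
    f.write(s)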

A Python 2 console session to show this:

>>> with io.open('foo', 'w', encoding='utf-16-le') as f:
...     f.write(unichr(0x7483))
...
1L
>>> with io.open('foo', 'r', encoding='utf-16-le') as f:
...     print(f.read())
...
璃
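
To check the byte counts mentioned in the question, you can compare both encodings directly. This is a small Python 3 sketch; the variable names are only for illustration.

```python
ch = chr(0x7483)  # the character 璃

utf8_bytes = ch.encode('utf-8')
utf16le_bytes = ch.encode('utf-16-le')

print(len(utf8_bytes), utf8_bytes.hex())        # 3 e79283
print(len(utf16le_bytes), utf16le_bytes.hex())  # 2 8374
```

So this character takes 3 bytes in UTF-8 but only 2 bytes in UTF-16-LE. Every code point up to U+FFFF is a single 2-byte unit in UTF-16; only code points above that need 4 bytes (a surrogate pair).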

Upvotes: 2
