Python C-API: How to pass a null-terminated UTF-16 Unicode C string to my Python app without converting to UTF-8?

Question

Pythonistas,

I'm trying to write a Python extension in C that passes a large number of null-terminated, UTF-16 encoded Unicode C strings to my Python application. The Unicode strings from my C library are guaranteed to always use 16-bit code units. I'm NOT using wchar_t in my C library on Linux because the size of wchar_t may vary between platforms.

I found a lot of functions (PyUnicode_AsUTF8String, PyString_FromStringAndSize, PyString_FromString, etc.) that do exactly what I want, but all of these functions are designed for 8-bit character/string representations.

The Python documentation (http://docs.python.org/howto/unicode.html) says:

"Under the hood, Python represents Unicode strings as either 16- or 32-bit integers, depending on how the Python interpreter was compiled."

I'm really keen to avoid the performance penalty of converting all my UTF-16 C strings to UTF-8 C strings purely for the Python interface, especially on Windows if the Python interpreter uses 16 bits "under the hood" as well.
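To make the goal concrete, here is a minimal sketch of the conversion I have in mind, written in Python with ctypes rather than in the extension itself. It assumes little-endian UTF-16 and simulates the C library's buffer with a ctypes array; the helper names `utf16_units_until_nul` and `decode_utf16_z` are made up for illustration. The C-API has a function named PyUnicode_DecodeUTF16 that takes a byte pointer, a size in bytes, an error mode and a byte-order pointer, which may be the equivalent entry point on the C side, but I have not verified how it behaves for my case.

```python
import ctypes


def utf16_units_until_nul(buf):
    """Count the 16-bit code units before the terminating 0x0000 unit."""
    n = 0
    while buf[n] != 0:
        n += 1
    return n


def decode_utf16_z(buf, codec="utf-16-le"):
    """Decode a null-terminated UTF-16 buffer without a UTF-8 detour.

    `buf` is a ctypes array of c_uint16; `codec` is an assumption about
    the byte order the C library writes (little-endian here).
    """
    n_units = utf16_units_until_nul(buf)
    # Copy exactly n_units * 2 bytes out of the buffer, excluding the terminator.
    raw = ctypes.string_at(ctypes.addressof(buf), n_units * 2)
    return raw.decode(codec)


# Simulate what the C library would hand over: UTF-16-LE bytes plus a
# 16-bit null terminator. The emoji needs a surrogate pair (two units).
text = "Grüße, 世界 😀"
encoded = text.encode("utf-16-le")
n_units = len(encoded) // 2
c_buf = (ctypes.c_uint16 * (n_units + 1)).from_buffer_copy(encoded + b"\x00\x00")

result = decode_utf16_z(c_buf)
```

The length is counted in 16-bit units, not bytes, because a plain strlen-style byte scan would stop at the first zero byte inside an ASCII character's code unit.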

Any ideas on how to tackle this challenge would be highly appreciated.

Thanks, Thomas
