Reputation: 479
How can I change the character encoding of a string to UTF-8? I am making some execv calls to a Python program, but Python returns the strings with some characters cut off. I don't know if this is a Python issue or a C issue, but I thought that if I can change the string's encoding in C and then pass it to Python, it should do the trick. So how can I do that?
Thanks.
Upvotes: 0
Views: 2437
Reputation: 56976
There is no such thing as character encoding in C.
A char* can hold any data; how you interpret the characters is up to you. For instance, printf will typically dump the characters as they are to the standard output, and if your console interprets those characters as UTF-8, they'll appear as such.
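To illustrate that a char* is just bytes, here is a minimal sketch that counts code points in a buffer assumed to already hold valid UTF-8. The function name utf8_code_points is made up for this example; it is not a standard library call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a string that is ASSUMED
 * to be valid UTF-8. Continuation bytes look like 10xxxxxx, so every
 * byte that does not match that pattern starts a new code point. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}
```

For the UTF-8 bytes of "héllo" (where é is the two bytes 0xC3 0xA9), strlen reports 6 while utf8_code_points reports 5. C itself never looks at the encoding; only the interpretation differs.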
If you want to convert between different encodings on the C side, you can have a look at ICU.
If you want to convert between encodings on the Python side, look at http://docs.python.org/howto/unicode.html.
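For one simple case you may not need a library at all. The sketch below is hand-written and does not use ICU; it assumes the input is ISO-8859-1 (Latin-1), which the question does not confirm. The name latin1_to_utf8 is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated string ASSUMED to be
 * ISO-8859-1 (Latin-1) into UTF-8.
 * Returns the number of bytes written (excluding the NUL terminator),
 * or (size_t)-1 if the output buffer is too small. */
size_t latin1_to_utf8(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char)*in;
        if (c < 0x80) {
            /* ASCII maps to itself; keep room for the terminator. */
            if (o + 1 >= out_size)
                return (size_t)-1;
            out[o++] = (char)c;
        } else {
            /* U+0080..U+00FF become two bytes: 110000xx 10xxxxxx. */
            if (o + 2 >= out_size)
                return (size_t)-1;
            out[o++] = (char)(0xC0 | (c >> 6));
            out[o++] = (char)(0x80 | (c & 0x3F));
        }
    }
    if (o >= out_size)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

A Latin-1 string needs at most twice its length plus one byte once converted, so a buffer of 2 * strlen(in) + 1 bytes is always enough. For any other source encoding, a real conversion library such as ICU is the safer route.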
Upvotes: 3
Reputation: 3596
C as a language does not facilitate string encoding. A C string is simply a null-terminated sequence of characters (8-bit integers on most systems; whether char is signed is implementation-defined).
A wide string (with characters of type wchar_t, typically 16 bits on Windows and 32 bits on most Unix-like systems) can also be used to hold larger character values; however, again, C standard library functions and data types are in no way aware of any concept of string encoding.
The answer to your question is to ensure that the strings you're passing into Python are encoded as UTF-8.
To help you accomplish that in any detail, however, you will have to provide more information about how your strings are currently formed, what they contain, and how you're constructing your argument list for exec.
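In the meantime, one way to find out whether the strings are already UTF-8 is to validate each argument before it goes into the argument list. The following is a rough sketch, not a reference implementation; is_valid_utf8 is a name invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the NUL-terminated string is
 * well-formed UTF-8 (no stray or missing continuation bytes, no
 * overlong forms, no surrogates, nothing above U+10FFFF). */
bool is_valid_utf8(const char *s)
{
    /* Smallest code point allowed for each sequence length. */
    static const unsigned int min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)s;

    while (*p != '\0') {
        unsigned int cp;
        int extra;  /* number of continuation bytes expected */

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return false;  /* stray continuation or invalid lead byte */
        ++p;

        for (int i = 0; i < extra; ++i, ++p) {
            /* Each following byte must be 10xxxxxx; this also rejects
             * a string that ends in the middle of a sequence. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3Fu);
        }

        if (cp < min_cp[extra])             return false;  /* overlong */
        if (cp >= 0xD800 && cp <= 0xDFFF)   return false;  /* surrogate */
        if (cp > 0x10FFFF)                  return false;  /* out of range */
    }
    return true;
}
```

If this returns false for an argument you are about to pass to execv, the bytes are in some other encoding (or are corrupted) and need converting first; if it returns true, the problem is more likely on the Python side.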
Upvotes: 3