wonnie

Reputation: 479

How to change a string's encoding to UTF-8 in C

How can I change the character encoding of a string to UTF-8? I am making some execv calls to a Python program, but Python returns the strings with some characters cut off. I don't know whether this is a Python issue or a C issue, but I thought that if I could change the string's encoding in C and then pass it to Python, it should do the trick. So how can I do that?

Thanks.

Upvotes: 0

Views: 2437

Answers (2)

Alexandre C.

Reputation: 56976

There is no such thing as character encoding in C.

A char* can hold any data; how you interpret the characters is up to you. For instance, printf will typically dump the characters as they are to standard output, and if your console interprets those bytes as UTF-8, they'll appear as such.
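To make the "just bytes" point concrete, here is a minimal sketch. The helper names utf8_code_points and dump_bytes are made up for illustration, and the counter assumes its input is already valid UTF-8.

```c
#include <stddef.h>
#include <stdio.h>

/* Count code points in a buffer that is ASSUMED to hold valid UTF-8:
 * every byte that is not a continuation byte (10xxxxxx) starts a new one. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Print each byte in hex, so the raw contents are visible regardless of
 * how the terminal would choose to render them. */
void dump_bytes(const char *s)
{
    for (; *s != '\0'; ++s)
        printf("%02X ", (unsigned char)*s);
    putchar('\n');
}
```

For the UTF-8 bytes "caf\xC3\xA9" (café), strlen reports 5 because it counts bytes, while utf8_code_points reports 4. Neither function knows or checks what encoding the buffer was meant to be in.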

If you want to convert between different encodings on the C side, you can have a look at ICU.
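ICU covers arbitrary encodings. For one narrow case the conversion is simple enough to write by hand: if the source bytes happen to be ISO-8859-1 (Latin-1), each byte maps directly to the code point of the same value. The sketch below is not ICU, and it assumes a Latin-1 input, which the question does not confirm; latin1_to_utf8 is a hypothetical helper name.

```c
#include <stddef.h>

/* Convert a NUL-terminated ISO-8859-1 string to UTF-8.
 * ASSUMPTION: src really is Latin-1; this cannot be detected from the bytes.
 * Returns the number of bytes written (excluding the terminator),
 * or (size_t)-1 if dst_size is too small. */
size_t latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        size_t need = (c < 0x80) ? 1 : 2;
        if (out + need >= dst_size)      /* keep room for the terminator */
            return (size_t)-1;
        if (c < 0x80) {
            dst[out++] = (char)c;                     /* ASCII: unchanged */
        } else {
            dst[out++] = (char)(0xC0 | (c >> 6));     /* lead byte */
            dst[out++] = (char)(0x80 | (c & 0x3F));   /* continuation byte */
        }
    }
    if (out >= dst_size)                 /* no room even for the terminator */
        return (size_t)-1;
    dst[out] = '\0';
    return out;
}
```

A destination buffer of twice the source length plus one byte is always large enough, since each Latin-1 byte expands to at most two UTF-8 bytes.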

If you want to convert between encodings on the Python side, look at http://docs.python.org/howto/unicode.html.

Upvotes: 3

kqnr

Reputation: 3596

C as a language does not facilitate string encoding. A C string is simply a null-terminated sequence of characters (8-bit integers on most systems; whether char is signed is implementation-defined).

A wide string (with characters of type wchar_t, typically 16 bits on Windows and 32 bits on Linux and macOS) can also be used to hold larger character values; however, again, C standard library functions and data types are in no way aware of any concept of string encoding.

The answer to your question is to ensure that the strings you're passing into Python are encoded as UTF-8.
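One way to check that before calling exec is to validate each argument. The following is a minimal sketch of a UTF-8 validity check, not a library routine; is_valid_utf8 is a hypothetical name. It rejects stray continuation bytes, truncated sequences, overlong forms, surrogates, and values above U+10FFFF.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if the NUL-terminated string s is well-formed UTF-8. */
bool is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    while (*p != 0) {
        unsigned int cp;
        int extra;   /* number of continuation bytes expected */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return false;   /* stray continuation or invalid lead byte */
        ++p;
        for (int i = 0; i < extra; ++i, ++p) {
            /* A NUL here fails the test too, so truncation is caught. */
            if ((*p & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (*p & 0x3F);
        }
        /* Reject overlong encodings of small code points. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return false;
        /* Reject UTF-16 surrogates and values beyond Unicode's range. */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
    }
    return true;
}
```

A string that fails this check is in some other encoding (or is corrupted) and would need converting before it is handed to Python. A string that passes is well-formed UTF-8, although the check cannot prove that UTF-8 was the encoding the data was originally written in.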

To help you with that in any more detail, however, you will have to provide more information about how your strings are currently formed, what they contain, and how you're constructing your argument list for exec.

Upvotes: 3
