user2399415
user2399415

Reputation: 207

wchar_t encoding on different platforms

I faced a problem with encodings on different platforms (in my case Windows and Linux). On windows, size of wchar_t is 2 bytes, whereas on Linux it's 4 bytes. How can I "standardize" wchar_t to be same size for both platforms? Is it hard to implement without additional libraries? For now, I'm aiming for printf/wprintf API. The data is sent via socket communication. Thank you.

Upvotes: 4

Views: 630

Answers (2)

Isaac
Isaac

Reputation: 215

To solve this problem I think you are going to have to convert all strings into UTF-8 before transmitting. On Windows you would use the WideCharToMultiByte function to convert wchar_t strings to UTF-8 strings, and MultiByteToWideChar to convert UTF-8 strings into wchar_t strings.

On Linux things aren't as straightforward. You can use the functions wctomb and mbtowc, however what they convert to/from depends on the underlying locale setting. So if you want these to convert to/from UTF-8 and Unicode then you'll need to make sure the locale is set to use UTF-8 encoding.

This article might also be a good resource.

Upvotes: 0

Mr.C64
Mr.C64

Reputation: 43044

If you want to send Unicode data across different platforms and architectures, I'd suggest using UTF-8 encoding and (8-bit) chars. UTF-8 has some advantages like not having endiannes issues (UTF-8 is just a plain sequence of bytes, instead both UTF-16 and UTF-32 can be little-endian or big-endian...).

On Windows, just convert the UTF-8 text to UTF-16 at the boundary of Win32 APIs (since Windows APIs tend to work with UTF-16). You can use the MultiByteToWideChar() API for that.

Upvotes: 3

Related Questions