Reputation: 161
I need C++ code to convert a string given as wchar_t* to a UTF-16 string. It must work on both Windows and Linux. I've looked through a lot of web pages while searching, but the subject is still not clear to me.
As I understand it, I need to:
1. Call setlocale with LC_CTYPE and a UTF-16 encoding.
2. Call wcstombs to convert the wchar_t string to a UTF-16 string.
3. Call setlocale again to restore the previous locale.
Is there a way to convert wchar_t* to UTF-16 portably (Windows and Linux)?
Upvotes: 6
Views: 10451
Reputation: 5787
You can assume that wchar_t is UTF-32 in the non-Windows world. That is true on Linux, Mac OS X, and most *nix systems (there are very few exceptions, and they are on systems you will probably never touch :-)).
On Windows, wchar_t is UTF-16, so there the conversion function can just do a memcpy :-)
On everything else, the conversion is algorithmic and pretty simple, so there is no need for fancy support from 3rd-party libraries.
Here is the basic algorithm: http://unicode.org/faq/utf_bom.html#utf16-3
And you can probably find a dozen different implementations if you don't want to write your own :-)
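A minimal sketch of that approach, under the assumptions stated above: a 2-byte wchar_t already holds UTF-16 and a 4-byte wchar_t holds UTF-32. The names append_utf16 and wide_to_utf16 are made up for illustration, and replacing invalid values with U+FFFD is a choice of this sketch, not something the FAQ mandates.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-16 and append it.
inline void append_utf16(std::u16string& out, char32_t cp) {
    // Lone surrogates and out-of-range values are not valid code points to
    // encode; substituting U+FFFD is an assumption of this sketch.
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        cp = 0xFFFD;
    }
    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single 16-bit code unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split the 20-bit offset into a surrogate pair.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
}

// Hypothetical helper: convert a null-terminated wchar_t string to UTF-16.
inline std::u16string wide_to_utf16(const wchar_t* s) {
    std::u16string out;
    if (s == nullptr) {
        return out;
    }
    for (; *s != L'\0'; ++s) {
        if (sizeof(wchar_t) == 2) {
            // Assumed to be UTF-16 already (Windows): copy code units as-is.
            out.push_back(static_cast<char16_t>(*s));
        } else {
            // Assumed to be UTF-32 (Linux, Mac OS X): encode each code point.
            append_utf16(out, static_cast<char32_t>(*s));
        }
    }
    return out;
}
```

Calling wide_to_utf16(L"...") returns a std::u16string in the host's byte order; if you need a specific byte order on disk or on the wire, you still have to serialize the code units yourself.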
Upvotes: 3
Reputation: 18268
The problem is that wchar_t is rather underspecified. You could use GNU libiconv to do what you want. It accepts the special encoding name "wchar_t" as both source and target encoding. That way it will be portable to Windows, Linux, and anywhere else you can provide libiconv.
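For illustration, a rough sketch of the iconv route. It assumes an iconv implementation that recognizes the "wchar_t" encoding name and that the target name "UTF-16LE" is available; UTF-16LE is picked here only so the output has a fixed byte order. The function name wide_to_utf16_iconv is made up, and on some platforms you may need to link against libiconv explicitly.

```cpp
#include <iconv.h>

#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: convert a null-terminated wchar_t string to UTF-16
// code units by asking iconv for UTF-16LE bytes.
inline std::u16string wide_to_utf16_iconv(const wchar_t* s) {
    // "wchar_t" is the special source encoding name; assumed to be supported.
    iconv_t cd = iconv_open("UTF-16LE", "wchar_t");
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed");
    }

    const std::size_t len = std::wcslen(s);
    std::size_t in_left = len * sizeof(wchar_t);
    // Worst case: every input element becomes a surrogate pair (4 bytes).
    std::vector<char> buf(len * 4 + 4);
    char* in = reinterpret_cast<char*>(const_cast<wchar_t*>(s));
    char* out = buf.data();
    std::size_t out_left = buf.size();

    const std::size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == static_cast<std::size_t>(-1)) {
        throw std::runtime_error("iconv conversion failed");
    }

    // Assemble little-endian byte pairs into code units, independent of the
    // host's byte order.
    const std::size_t bytes = buf.size() - out_left;
    std::u16string result;
    for (std::size_t i = 0; i + 1 < bytes; i += 2) {
        const unsigned lo = static_cast<unsigned char>(buf[i]);
        const unsigned hi = static_cast<unsigned char>(buf[i + 1]);
        result.push_back(static_cast<char16_t>(lo | (hi << 8)));
    }
    return result;
}
```

The sketch converts the whole string in one call with a worst-case output buffer; for large inputs you would more likely loop and handle a full output buffer (E2BIG) instead.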
Upvotes: 2
Reputation: 474376
There is no single cross-platform method for doing this in C++03 (not without a library). This is in part because wchar_t is itself not the same thing across platforms. Under Windows, wchar_t is a 16-bit value, while on other platforms it is often a 32-bit value. So you would need two different codepaths to do it.
Upvotes: 8
Reputation: 53097
C++11's std::codecvt_utf16 should work, I think.
"std::codecvt_utf16 is a std::codecvt facet which encapsulates conversion between a UTF-16 encoded byte string and UCS2 or UCS4 character string (depending on the type of Elem)."
See this: http://en.cppreference.com/w/cpp/locale/codecvt_utf16
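A possible way to use it, as a sketch rather than a tested recipe: std::wstring_convert wraps the facet and returns the UTF-16 bytes in a std::string. The helper name wide_to_utf16le_bytes is made up, and std::little_endian is chosen here only to get a fixed byte order. Two caveats: <codecvt> and std::wstring_convert were deprecated in C++17, so newer compilers may warn, and where wchar_t is 16 bits the facet treats the input as UCS2, so characters outside the BMP may not convert as you expect.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: return the UTF-16LE byte sequence for a wide string.
// to_bytes throws std::range_error if the conversion fails.
inline std::string wide_to_utf16le_bytes(const std::wstring& ws) {
    std::wstring_convert<
        std::codecvt_utf16<wchar_t, 0x10FFFF, std::little_endian>> conv;
    return conv.to_bytes(ws);
}
```

The result is a byte string, two bytes per UTF-16 code unit, with no byte order mark since std::generate_header is not set.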
Upvotes: 5