Reputation: 366
I just ran some successful tests with ICU in C/C++. I need to parse CSV files in different encodings (might be UTF-8, UTF-16LE, ...), modify the data, and finally write everything to a file as UTF-8. That's why I chose ICU. Character set detection usually works pretty well, and so do character handling and conversion to UTF-8.
Now I want to integrate the part of my code that does CSV loading, manipulation and so on with a GUI library, Nana. Nana seems to use std::string and std::wstring.
ICU stores all data internally as UTF-16, so I end up with either UChar arrays or UnicodeStrings when working with it. But how can I use either of them with Nana, which doesn't integrate with ICU? Is there a way to transform a UChar array or a UnicodeString to a wstring?
I didn't find any hints in the ICU documentation, so... maybe somebody else has already worked on this?
Upvotes: 1
Views: 520
Reputation: 3571
Most nana functions expect std::string encoded in UTF-8.
You could use the ICU functions that take or return char * to do the conversion to UTF-8.
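To make the UTF-16 to UTF-8 step concrete, here is a minimal dependency-free sketch. The helper name utf16_to_utf8 is hypothetical and is not part of ICU or nana; it assumes the UTF-16 data is available as char16_t code units and replaces unpaired surrogates with U+FFFD. With ICU at hand you would more likely call UnicodeString::toUTF8String, which fills a std::string directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not an ICU or nana function): encodes UTF-16 code
// units as UTF-8. Unpaired surrogates are replaced with U+FFFD.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            // High surrogate: combine with the following low surrogate.
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD;  // low surrogate without a preceding high surrogate
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The resulting std::string can then be handed to any nana function that expects UTF-8.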
A few nana functions, like widget::caption, have overloads for std::wstring, which is expected to be encoded in UTF-16 (on Windows) or UTF-32 (on Linux). These can be used to pass the OS a string in its native character type and encoding.
In case you need conversions, nana offers nana::charset, which can manage (explicitly or implicitly) some of the most frequently needed conversions from/to UTF-8/UTF-16/UTF-32.
If you experiment with passing static_cast<wchar_t *>(some_UChar*) to nana, please tell us about the result. I can't test it myself.
The nana documentation about Unicode treatment urgently needs to be updated (mea culpa).
Upvotes: 1
Reputation: 149075
According to the ICU documentation, a UChar array is an array of 16-bit-wide characters, which matches a wchar_t array on platforms where wchar_t is 16 bits wide (Windows, for example; on Linux wchar_t is typically 32 bits). That means that, provided wchar_t is 16 bits wide on your system, you can cast the result of the getTerminatedBuffer() function to a const wchar_t * and either use it directly as a C wide character string or use it to build a std::wstring.
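Where wchar_t may not be 16 bits wide, copying is a safer route than casting. The following is a minimal sketch, not ICU or nana API: the helper name utf16_to_wstring is hypothetical, and it assumes the UTF-16 data is available as char16_t code units (UChar is char16_t in ICU 59 and later), for example from UnicodeString::getBuffer() together with length(). ICU also ships u_strToWCS in ustring.h for the same purpose.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an ICU or nana function): copies UTF-16 code
// units into a std::wstring. With a 16-bit wchar_t the units are copied
// as-is; with a wider wchar_t, surrogate pairs are combined into a single
// code point so the result is valid UTF-32.
std::wstring utf16_to_wstring(const char16_t* src, std::size_t len) {
    std::wstring out;
    out.reserve(len);
    for (std::size_t i = 0; i < len; ++i) {
        char32_t cp = src[i];
        if (sizeof(wchar_t) > 2 &&
            cp >= 0xD800 && cp <= 0xDBFF &&
            i + 1 < len && src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            // High surrogate followed by low surrogate: decode the pair.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            ++i;
        }
        out.push_back(static_cast<wchar_t>(cp));
    }
    return out;
}
```

The returned std::wstring can be passed to the nana overloads that accept wide strings, and the same code behaves sensibly on both Windows and Linux.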
Upvotes: 0