Captain Jack sparrow

Reputation: 1019

Convert ICU Unicode string to std::wstring (or wchar_t*)

Is there an ICU function to create a std::wstring from an ICU UnicodeString? I have been searching the ICU manual but haven't been able to find one.

(I know I can convert the UnicodeString to UTF-8 and then convert that to a platform-dependent wchar_t*, but I am looking for a single function in UnicodeString that does this conversion.)

Upvotes: 1

Views: 1998

Answers (1)

dreamlax

Reputation: 95335

The C++ standard doesn't dictate any specific encoding for std::wstring. On Windows systems, wchar_t is 16-bit, and on Linux, macOS, and several other platforms, wchar_t is 32-bit. As far as C++'s std::wstring is concerned, it is just an arbitrary sequence of wchar_t in much the same way that std::string is just an arbitrary sequence of char.
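To see the platform difference concretely, here is a minimal sketch using only the standard library. The helper names wchar_bits() and units_for_non_bmp() are made up for illustration and are not part of ICU or the standard.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical helper: width of wchar_t in bits on the current platform.
constexpr std::size_t wchar_bits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Hypothetical helper: number of wchar_t units the compiler emits for
// U+1F600, a code point outside the Basic Multilingual Plane.
std::size_t units_for_non_bmp() {
    const std::wstring s = L"\U0001F600";
    return s.size();
}
```

With a 16-bit wchar_t the literal is stored as a surrogate pair (two units), while with a 32-bit wchar_t it fits in a single unit, so the same std::wstring can have a different size() depending on the platform.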

It seems that icu::UnicodeString has no built-in way of creating a std::wstring, but if you want a std::wstring anyway, you can use the C API function u_strToWCS() like this:

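#include <string>
#include <unicode/unistr.h>
#include <unicode/ustring.h>

icu::UnicodeString ustr = /* get from somewhere */;
std::wstring wstr;

int32_t requiredSize = 0;
UErrorCode error = U_ZERO_ERROR;

// obtain the size of string we need (this preflight call is expected to set U_BUFFER_OVERFLOW_ERROR)
u_strToWCS(nullptr, 0, &requiredSize, ustr.getBuffer(), ustr.length(), &error);

// resize accordingly (this does not include a terminating null character, and it doesn't need to)
wstr.resize(requiredSize);

// reset the error code, otherwise the next ICU call returns immediately without converting
error = U_ZERO_ERROR;

// copy the UnicodeString buffer to the std::wstring (non-const data() needs C++17; use &wstr[0] before that)
u_strToWCS(wstr.data(), static_cast<int32_t>(wstr.size()), nullptr, ustr.getBuffer(), ustr.length(), &error);

// U_FAILURE(error) is true here if the conversion failed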

As I understand it, u_strToWCS() picks the most efficient method for converting from UChar to wchar_t; if they are the same size, then it is presumably just a straightforward copy.
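For illustration only, here is a rough standard-library sketch of what such a conversion involves. This is not ICU's implementation; utf16_to_wstring() is a hypothetical helper, it takes a std::u16string rather than a UnicodeString, and it passes unpaired surrogates through unchanged, which may differ from what ICU does.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of ICU: converts UTF-16 code units to a
// std::wstring using only the standard library.
std::wstring utf16_to_wstring(const std::u16string& src) {
    std::wstring out;
    out.reserve(src.size());

    if (sizeof(wchar_t) == sizeof(char16_t)) {
        // Same width (e.g. Windows): copy the code units one-to-one.
        for (char16_t cu : src) {
            out.push_back(static_cast<wchar_t>(cu));
        }
        return out;
    }

    // Wider wchar_t (e.g. Linux, macOS): combine surrogate pairs into
    // single code points. Unpaired surrogates are copied as-is.
    for (std::size_t i = 0; i < src.size(); ++i) {
        char32_t cp = src[i];
        const bool isHigh = cp >= 0xD800 && cp <= 0xDBFF;
        if (isHigh && i + 1 < src.size()) {
            const char32_t low = src[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i; // the low surrogate has been consumed
            }
        }
        out.push_back(static_cast<wchar_t>(cp));
    }
    return out;
}
```

In real code, u_strToWCS() is the better choice since it works directly on the UnicodeString buffer and reports problems through UErrorCode; the sketch only shows why the output length can differ from ustr.length() when wchar_t is 32-bit.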

Upvotes: 2
