Reputation: 1032
I have been searching for an answer to this question for some time, but I always end up with something different.
I have the following UTF-32 string: std::u32string utf32s = U"जि";
I would like to convert it to a UnicodeString: UnicodeString ustr;
I am using the ICU 65.1 library in C++ to handle Unicode strings for normalization and composition. I found the following link, which describes the conversion between strings very poorly, especially this passage:
Conversion of whole strings: u_strFromUTF32() and u_strToUTF32() in ustring.h.
Access to code points is trivial and does not require any macros.
Using a UTF-32 converter with all of the ICU conversion APIs in ucnv.h, including ones with an "Algorithmic" suffix.
UnicodeString has fromUTF32() and toUTF32() methods.
The alternative I have found is the following template function:
template <typename T>
void fromUTF32(const std::u32string& source, std::basic_string<T, std::char_traits<T>, std::allocator<T>>& result)
{
    wstring_convert<codecvt_utf8_utf16<T>, T> convertor;
    result = convertor.from_bytes(source);
}
However, this function does not seem to accept UnicodeString as a valid argument. More generally, given a string (wstring, string, u16string, ...), how can I write a template function that returns it as a UnicodeString?
Many thanks!
Upvotes: 1
Views: 1428
Reputation: 52336
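#include <iostream>
#include <string>
#include <unicode/unistr.h>
#include <unicode/ustream.h>
int main() {
    std::u32string utf32s = U"जि";
    // char32_t and UChar32 are both 32 bits wide, so the buffer can be
    // reinterpreted; fromUTF32 takes the length in code points as int32_t.
    auto ustr = icu::UnicodeString::fromUTF32(
        reinterpret_cast<const UChar32 *>(utf32s.c_str()),
        static_cast<int32_t>(utf32s.size()));
    // operator<< for UnicodeString is declared in unicode/ustream.h.
    std::cout << ustr << '\n';
    return 0;
}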
$ g++ u32.cpp $(icu-config --cxxflags --ldflags --ldflags-icuio)
$ ./a.out
जि
Upvotes: 3
Reputation: 136208
You should probably use icu::UnicodeString::fromUTF32:
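icu::UnicodeString asUnicodeString(std::u32string const& s) {
    // Guard the reinterpret_cast below: both types must have the same layout.
    static_assert(sizeof(std::u32string::value_type) == sizeof(UChar32), "");
    static_assert(alignof(std::u32string::value_type) == alignof(UChar32), "");
    // fromUTF32 expects the length as int32_t, so cast it explicitly.
    return icu::UnicodeString::fromUTF32(reinterpret_cast<UChar32 const*>(s.data()),
                                         static_cast<int32_t>(s.size()));
}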
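For context, icu::UnicodeString stores its text as UTF-16, so converting from UTF-32 means re-encoding every code point. The following is only a rough sketch of that re-encoding using the standard library alone. The helper name utf32ToUtf16 is hypothetical and not part of ICU, and replacing invalid code points with U+FFFD is a choice made for this sketch, not a statement about ICU's internals.

```cpp
#include <string>

// Hypothetical helper (not an ICU function): re-encode UTF-32 as UTF-16.
std::u16string utf32ToUtf16(const std::u32string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (char32_t cp : in) {
        const bool isSurrogate = cp >= 0xD800 && cp <= 0xDFFF;
        if (cp > 0x10FFFF || isSurrogate) {
            // Not a valid Unicode scalar value: this sketch substitutes U+FFFD.
            out.push_back(static_cast<char16_t>(0xFFFD));
        } else if (cp < 0x10000) {
            // Basic Multilingual Plane: one UTF-16 code unit.
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Supplementary plane: split into a high and a low surrogate.
            const char32_t v = cp - 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
        }
    }
    return out;
}
```

For the string in the question, both code points (U+091C and U+093F) are in the Basic Multilingual Plane, so the UTF-16 form has the same number of units as the UTF-32 form. The counts diverge only for code points above U+FFFF, which is worth remembering when comparing s.size() against the length of the resulting UnicodeString.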
Upvotes: 2