Reputation: 16630
I have a program that does various operations on the char elements of a std::string, for example:
if (my_string.front() == my_char) {
    // do stuff with my_string
}
I'm looking for some practical advice on how to make my program support Unicode. I need to be able to compare characters to characters, which I take to mean that 4-byte characters are required so that even the largest Unicode code points can be processed without loss.
I'm on Windows with a GCC compiler and read that in this case each element of std::wstring (a wchar_t) is 2 bytes. C++11 has std::u32string with 4-byte elements, but it seems largely unsupported by the standard library.
What's the easiest solution in this case?
Upvotes: 1
Views: 306
Reputation: 7996
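Even if you had a string of uint32 values, you could not just compare these integers one by one. You would first have to normalize the strings, because the same visible character can be encoded by more than one code point sequence (for example, "é" is either U+00E9 alone or U+0065 followed by the combining accent U+0301). Normalization is NOT simple, so you will end up using a library like ICU. You may as well use it from the start :)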
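To make the normalization point concrete, here is a minimal sketch using only the standard library. The names same_code_points, precomposed, decomposed and demo are made up for this illustration. It shows only why a naive element-wise comparison of 4-byte strings is not enough; it does not perform normalization, which is what a library such as ICU would provide.

```cpp
#include <cassert>
#include <string>

// Naive comparison: true only if both strings contain exactly the
// same sequence of code points. No normalization is applied.
bool same_code_points(const std::u32string& a, const std::u32string& b) {
    return a == b;
}

// Two different encodings of the same visible character "é".
const std::u32string precomposed = U"\u00E9";   // U+00E9 (single code point)
const std::u32string decomposed  = U"e\u0301";  // U+0065 + combining acute U+0301

// Hypothetical helper that spells out what the naive comparison sees.
void demo() {
    assert(precomposed.size() == 1);
    assert(decomposed.size() == 2);
    // Same text to a reader, different code point sequences:
    assert(!same_code_points(precomposed, decomposed));
}
```

Both strings would compare equal only after being brought to the same normalization form (for example NFC), which is the step the answer suggests delegating to a library.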
Upvotes: 2
Reputation: 10536
Windows uses the UTF-16 encoding: http://en.wikipedia.org/wiki/UTF-16
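You don't need "four byte characters" to support all Unicode symbols. UTF-16 is a variable-length encoding: code points up to U+FFFF take one 16-bit code unit, and code points above U+FFFF take two (a surrogate pair).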
Good reading material: http://www.joelonsoftware.com/articles/Unicode.html
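As an illustration of the variable-length point in this answer, the following sketch encodes a single code point as UTF-16 code units. The function name encode_utf16 is invented for this example and is not part of the standard library or the Windows API. It assumes the input is a valid Unicode scalar value.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-16 (one or two 16-bit code units).
// Assumption: cp is a valid scalar value, i.e. at most U+10FFFF and not
// itself inside the surrogate range U+D800..U+DFFF.
std::u16string encode_utf16(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::u16string out;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: stored directly in one code unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary planes: split the 20-bit offset into a surrogate pair.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));    // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));  // low surrogate
    }
    return out;
}
```

For example, encode_utf16(U'A') yields one code unit, while encode_utf16(0x1F600) yields the pair 0xD83D, 0xDE00. This is also why front() on a UTF-16 string may return only half of a character.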
Upvotes: 1