user963241

Reputation: 7048

Size of wchar_t for Unicode encoding

Is there a 32-bit wide character type for encoding UTF-32 strings? I'd like to do it via std::wstring, but apparently the size of a wide character is 16 bits on the Windows platform.
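
A minimal check that shows what I'm seeing (sketch):

#include <iostream>

int main() {
    // MSVC on Windows typically reports 2 bytes (UTF-16 code units);
    // GCC/Clang on Linux and macOS typically report 4.
    std::cout << "sizeof(wchar_t) = " << sizeof(wchar_t) << '\n';
}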

Upvotes: 4

Views: 19477

Answers (4)

Hunter Kohler

Reputation: 2745

The modern answer is to use char32_t (C++11), which can be used with std::u32string. In practice, though, you should usually just use std::string with an encoding like UTF-8. Note that before char32_t, the old approach was to use templates or macros to determine which unsigned integral type is 4 bytes in size, and use that.
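
A minimal sketch of both options, assuming a C++11 compiler:

#include <string>

int main() {
    char32_t c = U'A';                    // dedicated 32-bit character type
    std::u32string s32 = U"Hello World";  // string of char32_t elements
    std::string utf8 = "Hello World";     // UTF-8 bytes in a plain byte string
    (void)c; (void)s32; (void)utf8;
}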

Upvotes: 3

Masud Al Mahdi

Reputation: 1

Just use typedef!

It would look something like this:

typedef char32_t char_32;  // char32_t (C++11) rather than int, so U"..." literals work

And use it like this:

char_32 myChar = U'A';

Or as a C-string:

const char_32* string_of_32_bit_char = U"Hello World";  // U"..." has type const char32_t*
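
For completeness, a self-contained sketch of walking such a string (counting 32-bit code units by hand, since std::strlen only works on char):

#include <cstddef>
#include <iostream>

typedef char32_t char_32;

int main() {
    const char_32* s = U"Hello World";
    std::size_t n = 0;
    while (s[n] != U'\0') ++n;  // count 32-bit code units up to the terminator
    std::cout << n << '\n';     // prints 11
}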

Upvotes: 0

Michael Aaron Safyan

Reputation: 95629

The size of wchar_t is platform-dependent, and it is independent of UTF-8, UTF-16, and UTF-32 (it can be used to hold Unicode data, but nothing requires that it does).

I strongly recommend using UTF-8 with std::string for internal string representation, and using an established library such as ICU for complex manipulation and conversion tasks involving Unicode.
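
A minimal sketch of that pattern, assuming ICU is installed (icu::UnicodeString::fromUTF8 and toUTF8String are part of ICU's C++ API; build flags such as -licuuc vary by platform):

#include <iostream>
#include <string>
#include <unicode/unistr.h>

int main() {
    std::string utf8 = "gr\xC3\xBC\xC3\x9F";  // "grüß" as raw UTF-8 bytes
    icu::UnicodeString us = icu::UnicodeString::fromUTF8(utf8);  // decode UTF-8
    us.toUpper();              // locale-aware manipulation happens on ICU's type
    std::string out;
    us.toUTF8String(out);      // encode back to UTF-8 for storage/interchange
    std::cout << out << '\n';  // prints GRÜSS ("ß" uppercases to "SS")
}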

Upvotes: 8

David Heffernan

Reputation: 613481

You won't be able to do it with std::wstring on many platforms, because its elements will be 16 bits wide.

Instead you should use std::basic_string<char32_t>, but this requires a compiler with some C++0x support.
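
A minimal sketch, assuming that C++0x/C++11 support is available:

#include <string>

int main() {
    std::basic_string<char32_t> s = U"Hello World";  // elements are 32-bit
    std::u32string same = s;  // u32string is the standard typedef for this type
    (void)same;
}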

Upvotes: 8
