Reputation: 22134
While testing some functions that convert strings between wchar_t and UTF-8, I got the following weird result with Visual C++ Express 2008:
std::wcout << L"élève" << std::endl;
prints out "ÚlÞve:", which is obviously not what is expected.
This is obviously a bug. How can that be? How am I supposed to deal with such a "feature"?
Upvotes: 3
Views: 1261
Reputation: 90985
This is obviously a bug. How can that be?
While other operating systems have dispensed with legacy character encodings and switched to UTF-8, Windows uses two legacy encodings: an "OEM" code page (used at the command prompt) and an "ANSI" code page (used by the GUI).
Your C++ source file is in ANSI code page 1252 (or possibly 1254, 1256, or 1258), but your console is interpreting it as OEM code page 850.
Upvotes: 1
Reputation: 5787
Your IDE and the compiler use the ANSI code page. The console uses the OEM code page.
It also matters what you are doing with those conversion functions.
Upvotes: 0
Reputation: 17405
The C++ compiler does not support Unicode in source files. You have to replace those characters with their escaped versions instead.
Try this:
std::wcout << L"\x00E9l\x00E8ve" << std::endl;
Your console must support Unicode as well.
UPDATE:
It's not going to produce the desired output in your console, because the console does not support Unicode.
Upvotes: 12
Reputation: 22134
I found these related questions with useful answers: "Is there a Windows command shell that will display Unicode characters?" and "How can I embed unicode string constants in a source file?"
Upvotes: 2
Reputation: 81268
You might also want to take a look at this question. It shows how you can actually hard-code Unicode characters into files using some compilers (I'm not sure what the options would be for MSVC).
Upvotes: 1