Reputation: 32912
I have trouble with wide string literals using the MinGW GCC compiler on Windows.
When I read user input using wscanf, wprintf outputs the national characters correctly. However, output of a wide string literal stops at the first national character:
wprintf (L"China - Čína"); // outputs "China - "
Assuming wchar_t is encoded as UTF-16 by default (is it LE or BE?), how does this work when the source is a UTF-8 file? I tried saving the source as UTF-16, but then I get an "illegal byte sequence" error.
Upvotes: 0
Views: 1211
Reputation: 32912
As @pasztorpisti suggested, I tried a memory viewer, and the substring Čína is stored as 0C 01 ED 00 6E 00 61 00, which is correct UTF-16LE.
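The same check can be done without a memory viewer. Below is a minimal sketch, assuming the string is available as 16-bit code units; the helper name utf16le_hex is hypothetical and not part of any library. It formats each code unit low byte first, which is the UTF-16LE byte order.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: formats a NUL-terminated array of UTF-16 code units
   as space-separated hex bytes in little-endian order (low byte first).
   The result is written to out, which holds at most out_size bytes. */
static void utf16le_hex(const uint16_t *s, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return;
    out[0] = '\0';

    for (; *s != 0; ++s) {
        unsigned lo = *s & 0xFFu;         /* low byte comes first in UTF-16LE */
        unsigned hi = (*s >> 8) & 0xFFu;  /* high byte follows */
        int n = snprintf(out + used, out_size - used, "%s%02X %02X",
                         used ? " " : "", lo, hi);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';             /* drop a partially written pair */
            break;
        }
        used += (size_t)n;
    }
}
```

Calling it on the code units U+010C, U+00ED, U+006E, U+0061 (Čína) gives the byte sequence quoted above. With MinGW, where wchar_t is 16 bits wide, a wide literal could be passed after a cast to const uint16_t *, but that cast is an assumption about the platform and not portable.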
My console uses CP852 as its default codepage, so I tried chcp 1200, but the codepage does not get set! MSDN says it is available to managed applications only. Microsoft knows how to create a coding hell.
It was very useful to read this answer carefully. I used WriteConsoleW to produce UTF-16LE output in the crippled console:
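#include <windows.h>
#include <wchar.h>

/* Write a wide string directly to the console as UTF-16. */
void putws(const wchar_t* str) {
    /* WriteConsoleW takes the length as a DWORD count of characters. */
    WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), str, (DWORD)wcslen(str), NULL, NULL);
}
putws(L"China - Čína"); // outputs "China - Čína"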
Upvotes: 2