Jan Turoň

Reputation: 32912

Wide characters string literal

I am having trouble with wide string literals using the MinGW GCC compiler on Windows.

When I read user input using wscanf, wprintf outputs the national characters correctly. However, output of a wide string literal stops at the first national character:

wprintf (L"China - Čína"); // outputs "China - "

Assuming wchar_t is encoded as UTF-16 by default (is it LE or BE?), how does this work when the source file is UTF-8? I tried saving the source as UTF-16, but then I get an "illegal byte sequence" error.
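One way to check the width and byte order of wchar_t on a given toolchain is to probe a single value in memory. This is only a minimal sketch: the helper names wchar_is_little_endian and wchar_bits are made up for illustration. On x86 Windows, where wchar_t is 16 bits wide, I would expect it to report little-endian storage.

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the first byte of a wchar_t in memory
   is its low-order byte (little-endian storage), 0 otherwise. */
int wchar_is_little_endian(void) {
    const wchar_t probe = (wchar_t)0x010C;            /* U+010C, 'Č' */
    const unsigned char *p = (const unsigned char *)&probe;
    return p[0] == 0x0C;
}

/* Hypothetical helper: width of wchar_t in bits
   (16 on Windows, typically 32 on Unix-like systems). */
size_t wchar_bits(void) {
    return sizeof(wchar_t) * CHAR_BIT;
}
```

The probe is written as a numeric value rather than as a literal character, so the result does not depend on the encoding of the source file.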

Upvotes: 0

Views: 1211

Answers (1)

Jan Turoň

Reputation: 32912

As @pasztorpisti suggested, I tried a memory viewer, and the substring Čína is stored as 0C 01 ED 00 6E 00 61 00, which is correct UTF-16LE.

My console uses CP852 as its default code page, so I tried chcp 1200, but the code page does not change! MSDN says 1200 is available to managed applications only. Microsoft knows how to create a coding hell.

It was very useful to read this answer carefully: I used WriteConsoleW to produce UTF-16LE output in the crippled console:
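The same check can be done without a memory viewer. Below is a rough sketch that encodes a wide string as UTF-16LE bytes and formats them as hex. The function names utf16le_bytes and hex_dump are my own, not part of any library, and the literal uses \u escapes so that the result does not depend on the source file encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <wchar.h>

/* Encode a wide string as UTF-16LE bytes. Returns the number of bytes
   written to out (never more than cap). Handles both 16-bit wchar_t
   (Windows) and 32-bit wchar_t (most Unix-like systems). */
size_t utf16le_bytes(const wchar_t *s, unsigned char *out, size_t cap) {
    size_t n = 0;
    assert(s != NULL && out != NULL);
    for (; *s != L'\0'; ++s) {
        unsigned long cp = (unsigned long)*s;
        unsigned units[2];
        size_t count = 1, i;
        if (cp > 0xFFFFUL) {              /* only possible with 32-bit wchar_t */
            cp -= 0x10000UL;              /* split into a surrogate pair */
            units[0] = 0xD800u + (unsigned)(cp >> 10);
            units[1] = 0xDC00u + (unsigned)(cp & 0x3FFu);
            count = 2;
        } else {
            units[0] = (unsigned)cp;
        }
        if (n + 2 * count > cap) break;   /* stop rather than overflow out */
        for (i = 0; i < count; ++i) {
            out[n++] = (unsigned char)(units[i] & 0xFFu);  /* low byte first */
            out[n++] = (unsigned char)(units[i] >> 8);     /* then high byte */
        }
    }
    return n;
}

/* Format bytes as space-separated uppercase hex, e.g. "0C 01 ED 00". */
void hex_dump(const unsigned char *bytes, size_t n, char *dst, size_t dst_cap) {
    size_t i, pos = 0;
    if (dst_cap > 0) dst[0] = '\0';
    for (i = 0; i < n && pos + 3 < dst_cap; ++i) {
        pos += (size_t)snprintf(dst + pos, dst_cap - pos,
                                i ? " %02X" : "%02X", bytes[i]);
    }
}
```

Calling utf16le_bytes(L"\u010C\u00EDna", buf, sizeof buf) and passing the result to hex_dump should give the byte sequence quoted above.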

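#include <windows.h>
#include <wchar.h>
// Write a UTF-16LE string straight to the console, bypassing the code page.
void putws(const wchar_t* str) {
  WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE), str, (DWORD)wcslen(str), NULL, NULL);
}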

putws(L"China - Čína"); // outputs "China - Čína"

Upvotes: 2
