Spixmaster

Reputation: 193

C++ - Why isn't the unicode output correct?

I have been working with Unicode in C++ for several days now, and it is still very unclear to me. I have a few questions about its usage and would be happy if they could be answered. The goal is simply for the output to be the string with the proper Unicode characters.

As far as I understand, � is printed when a character is broken, for example when you try to cast a wchar_t to a char.

My machine's OS: Kubuntu 19.10

g++ --version

g++ (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

1. Why does this work, given that std::string should only be capable of storing chars, which "é" is not?

setlocale(LC_ALL, "en_US.utf8");
std::cout << "é" << std::endl;

output: é

2. Printing a wchar_t behaves very strangely. Why is the output of each of the following snippets what it is?

setlocale(LC_ALL, "en_US.utf8");
wchar_t a = L'é';
std::cout << a << std::endl;

output: 233

setlocale(LC_ALL, "en_US.utf8");
wchar_t a = L'é';
std::wcout << a << std::endl;

output: �

setlocale(LC_ALL, "en_US.utf8");
wchar_t a = L'é';
printf("%lc\n", a);

output: é

setlocale(LC_ALL, "en_US.utf8");
wchar_t a = L'é';
wprintf(L"%lc\n", a);

output: é

PS: setlocale(LC_ALL, "en_US.utf8") is there as suggested by this source. Otherwise, std::wcout would print question marks instead of the proper chars.

Upvotes: 2

Views: 270

Answers (1)

AProgrammer

Reputation: 52324

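  • 1. g++ uses UTF-8 as its default execution character set. You can change it with -fexec-charset=, but with the default, the "é" in your first example is encoded in UTF-8. The literal therefore holds two chars (the bytes 0xC3 0xA9), which a std::string stores without any trouble.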

  • 2.a There is no operator<< taking a std::ostream and a wchar_t. That means the latter is promoted and displayed as a number (wchar_t, like char, is an integral type).
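
Both points can be checked in a few lines. This is only a minimal sketch: hex_bytes, promoted_text and check_e_acute are names made up for this illustration, and the literals are written with escapes ("\xC3\xA9" and L'\u00E9') so the result does not depend on how the source file itself is encoded. The unary plus makes the integral promotion explicit; as far as I know, C++20 deletes the narrow-stream operator<< overloads for wchar_t, so a plain std::cout << a may no longer compile in that mode.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Render the bytes of `s` as space-separated hex, similar to `od -t x1`.
std::string hex_bytes(const std::string& s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned char byte : s) {
        if (!out.empty()) out += ' ';
        out += digits[byte >> 4];
        out += digits[byte & 0x0F];
    }
    return out;
}

// Text a narrow stream produces for a wchar_t once it has been promoted
// to an integer. Unary plus performs that promotion explicitly.
std::string promoted_text(wchar_t wc) {
    std::ostringstream oss;
    oss << +wc;
    return oss.str();
}

// Self-check for the two cases discussed above.
void check_e_acute() {
    // The UTF-8 encoding of U+00E9 (é), spelled out byte by byte.
    const std::string narrow = "\xC3\xA9";
    assert(narrow.size() == 2);            // two chars, one character
    assert(hex_bytes(narrow) == "c3 a9");

    // Same character as L'é'; its code point is 0xE9 == 233.
    assert(promoted_text(L'\u00E9') == "233");
}
```

Calling check_e_acute() from main should pass silently: the std::string really holds two bytes, and the wide character really is just the integer 233 to a narrow stream.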

The others work as expected; I don't think more explanation is needed. One thing to be aware of, though, is that your environment has to be configured correctly. That's why I asked you to pipe the output through | od -t x1 to check that the output was the expected one. As it is, the issue is a display issue, and if you still have it, you'll have to check the configuration of your terminal emulator.
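
To know which bytes od should show for a given character, you can encode the code point by hand and compare. The helper below is a hypothetical sketch (encode_utf8 is not a standard library function) that assumes the terminal and locale use UTF-8, as in the question:

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as UTF-8.
// Hypothetical helper for illustration, not a standard library function.
std::string encode_utf8(char32_t cp) {
    // Only valid scalar values: at most U+10FFFF and not a UTF-16 surrogate.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::string out;
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx then three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For é (U+00E9) this gives the two bytes c3 a9. If od -t x1 shows c3 a9 followed by 0a for the newline, the program wrote the right bytes and any � you see comes from the terminal side.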

Upvotes: 2
