Reputation: 53
I was writing a small program that is supposed to display a ☻ character on the screen. The program is listed below:
#include <stdio.h>
int main(void)
{
    printf("☻\n");
    return 0;
}
However, when I run this program, I get the output of
Γÿ║
Why am I getting this output, and what should I do to get the output I want?
Upvotes: 2
Views: 195
Reputation: 7204
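#include <stdio.h>   /* wprintf, _fileno */
#include <io.h>      /* _setmode */
#include <fcntl.h>   /* _O_U16TEXT */
/* Inside main(), before any other output (Windows C runtime only): */
_setmode(_fileno(stdout), _O_U16TEXT);
wprintf(L"☻\n");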
valter
Upvotes: 1
Reputation: 881103
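You're getting that because whatever terminal program you're using most likely isn't interpreting the program's output as UTF-8, so it shows each byte of the encoded character as a separate character from its own code page.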
For example, my Debian box compiles that fine and it actually prints out the smiley face, because gnome-terminal is a damn fine piece of software :-)
The fact that you're seeing three characters instead of one is a fairly good indication that it's outputting UTF-8. In fact, if I run that program on my Debian box and capture the binary output with od -xcb, I see:
0000000    98e2    0abb
        342 230 273  \n
        342 230 273 012
0000004
showing that it is coming out in UTF-8; it's just that gnome-terminal is smart enough to turn that back into the correct glyph.
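If you don't have od handy (on Windows, for instance), a rough equivalent is to dump the bytes of the string yourself. This is only a sketch: hex_dump is a name I made up for illustration, and the string is spelled out with hex escapes so the result doesn't depend on how the source file was saved.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write each byte of s into out as two lowercase
   hex digits, separated by spaces. Returns the number of bytes dumped
   (it stops early if out is too small to hold the next byte). */
size_t hex_dump(const char *s, char *out, size_t out_size)
{
    size_t n = strlen(s);
    size_t i, pos = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (i = 0; i < n; i++) {
        /* Each byte needs at most 3 characters (" xx") plus the NUL. */
        if (pos + 4 > out_size)
            break;
        pos += (size_t)snprintf(out + pos, out_size - pos,
                                i ? " %02x" : "%02x",
                                (unsigned char)s[i]);
    }
    return i;
}
```

Calling hex_dump("\xe2\x98\xbb\n", buf, sizeof buf) with a 32-byte buf leaves "e2 98 bb 0a" in buf, which matches the od output above.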
Those bytes translate to binary as follows:
   e2     :    98     :    bb
1110 0010 : 1001 1000 : 1011 1011
And, using this excellent answer here, stating that bit patterns starting with 10 are continuation bytes, we can decode it as follows:
Code point range     Encoded bytes    Code point bits
U+000800-U+00ffff    1110yyyy         yyyyyyyy xxxxxxxx
                     10yyyyxx
                     10xxxxxx

   e2     :    98     :    bb
1110 0010 : 1001 1000 : 1011 1011
     yyyy     yy yyxx     xx xxxx
Hence the code point is 0010 0110 : 0011 1011, which equates to 263b (U+263B), which, in a total lack of coincidence, is the black smiling face character.
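The same decoding can be written as a few masks and shifts. This is a minimal sketch of the three-byte case only; decode_utf8_3 is a hypothetical name, and it doesn't reject overlong encodings or surrogates the way a real decoder should.

```c
/* Decode one three-byte UTF-8 sequence (the U+0800..U+FFFF row).
   Returns the code point, or -1 if the bytes don't match the
   1110yyyy 10yyyyxx 10xxxxxx pattern.
   Sketch only: overlong forms and surrogates are not rejected. */
long decode_utf8_3(unsigned char b1, unsigned char b2, unsigned char b3)
{
    if ((b1 & 0xF0) != 0xE0)            /* lead byte must be 1110xxxx */
        return -1;
    if ((b2 & 0xC0) != 0x80 ||          /* continuation bytes must be */
        (b3 & 0xC0) != 0x80)            /* 10xxxxxx                   */
        return -1;

    return ((long)(b1 & 0x0F) << 12)    /* top 4 bits    */
         | ((long)(b2 & 0x3F) << 6)     /* middle 6 bits */
         |  (long)(b3 & 0x3F);          /* low 6 bits    */
}
```

Feeding it the bytes from above, decode_utf8_3(0xe2, 0x98, 0xbb) gives 0x263b.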
In terms of fixing the problem of Windows not displaying Unicode correctly, as indicated by your comment:
I am on Windows Command Prompt. How should I make cmd.exe work with unicode?
You may want to look at this question, particularly the answer about using chcp to change the code page to 65001 (UTF-8). Note that I haven't tested this; I provide it only as a pointer for you.
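If you'd rather have the program switch the code page itself, the Win32 call SetConsoleOutputCP with CP_UTF8 (65001) is, as far as I know, the programmatic counterpart of chcp 65001. Like the chcp suggestion, I haven't tested this, so treat it as a sketch. The names use_utf8_console and print_smiley are mine, and the non-Windows branch simply assumes the terminal already handles UTF-8. The console font may also need to contain the glyph for it to show up.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Ask the console to interpret output bytes as UTF-8.
   Returns 1 on success, 0 on failure. On non-Windows systems this is
   a no-op that assumes the terminal is already using UTF-8. */
int use_utf8_console(void)
{
#ifdef _WIN32
    return SetConsoleOutputCP(CP_UTF8) != 0;
#else
    return 1;
#endif
}

/* Print U+263B as explicit UTF-8 bytes, so the output does not depend
   on the encoding the source file was saved in. */
void print_smiley(void)
{
    fputs("\xe2\x98\xbb\n", stdout);
}
```

You would call use_utf8_console() once at the start of main, check the result, and then call print_smiley().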
Upvotes: 5