Reputation: 21
When I write a C program and try to output special characters (like ä ö ü ß) with printf() in the cmd window on Windows 10, it only shows something like ▒▒▒▒▒▒▒▒▒▒▒▒
But if I just type them in the cmd window without a C program being executed, it displays these characters properly. When I change the console type to standard output in NetBeans, the output is correct as well. I tried to change the code page of cmd, but it didn't fix the problem. I use the gcc C compiler.
Upvotes: 1
Views: 6358
Reputation: 49086
The reason is the usage of different code pages for character encoding.
A GUI text editor that stores program code in a file with each character encoded as a single byte uses code page Windows-1252 in Western European and North American countries.
A console window opened on running a console application uses an OEM code page instead: OEM 850 in Western European countries and OEM 437 in North American countries.
So the characters ÄÖÜäöüß need different byte values in the code to be displayed as expected in the console window, at least on execution in Western European and North American countries.
Character Windows-1252 OEM 850
Ä \xC4 \x8E
Ö \xD6 \x99
Ü \xDC \x9A
ä \xE4 \x84
ö \xF6 \x94
ü \xFC \x81
ß \xDF \xE1
The code page used by default in a console window can be seen by opening a command prompt window and running either chcp (change code page) or mode, both of which display the active code page.
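The same information can be queried from inside a program. The following is only a minimal sketch: GetConsoleOutputCP() is the Windows console API call for this, while the wrapper name active_output_code_page() is made up for illustration, as is the choice to return 0 on other platforms.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: returns the active output code page of the
   attached console. Returns 0 when not built for Windows, or when the
   Windows call fails (for example because no console is attached). */
unsigned active_output_code_page(void)
{
#ifdef _WIN32
    return GetConsoleOutputCP();
#else
    return 0;
#endif
}

/* Prints the code page number, comparable to what chcp reports. */
void print_active_code_page(void)
{
    unsigned cp = active_output_code_page();
    if (cp == 0)
        printf("Active code page could not be determined.\n");
    else
        printf("Active code page: %u\n", cp);
}
```

Calling print_active_code_page() at the start of main() is a quick way to check which byte values the console expects before any text is written.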
The default code pages for GUI applications and console applications depend on the Windows region and language settings of the user account.
Some web pages you should read to better understand character encoding:
Programmers should not write non-ASCII characters directly into strings output by a compiled executable, because the bytes stored in the executable depend on the code page the compiler uses when it creates the binary representation of those characters. It is better to use hexadecimal notation when the code page active on execution of the application is known, or is set by the application before the string is output.
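As a rough sketch of the hexadecimal approach, assuming the program decides to use OEM 850 for its output: SetConsoleOutputCP() is the Windows console API call for setting the code page, while the names GREETING_CP850 and print_greeting_cp850() and the sample text are made up for illustration.

```c
#include <stdio.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* "Grüße aus Köln" encoded for OEM 850 with hexadecimal escapes.
   The literal is split after \xE1 because a hex escape consumes every
   following hex digit, so "\xE1e" would be read as one escape. */
static const char GREETING_CP850[] = "Gr\x81\xE1" "e aus K\x94ln\n";

/* Hypothetical helper: sets the console output code page to 850 on
   Windows and writes the greeting. Returns what printf() returns,
   which is the number of bytes written on success. */
int print_greeting_cp850(void)
{
#ifdef _WIN32
    /* Assumption: a console is attached; the return value is ignored here. */
    SetConsoleOutputCP(850);
#endif
    return printf("%s", GREETING_CP850);
}
```

Note that the changed code page may remain in effect for the console window after the program ends, so a more careful version would read the previous value with GetConsoleOutputCP() first and restore it before exiting.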
It is also possible to store strings in the executable in Unicode, determine the encoding of the output handle before outputting any string, and convert each Unicode string to that encoding before it is written to the output handle.
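To illustrate the principle, here is a small portable sketch that converts Unicode code points to the bytes of a given code page. It only covers ASCII plus the seven characters ÄÖÜäöüß, and the names struct mapping, MAPPINGS and encode_for_code_page() are hypothetical. In a real Windows program the conversion would more likely be left to the API function WideCharToMultiByte() with the code page returned by GetConsoleOutputCP().

```c
#include <stddef.h>

/* One row per character: Unicode code point and the byte that
   represents it in Windows-1252 and in OEM 850. */
struct mapping {
    unsigned short unicode;
    unsigned char cp1252;
    unsigned char cp850;
};

static const struct mapping MAPPINGS[] = {
    { 0x00C4, 0xC4, 0x8E }, /* Ä */
    { 0x00D6, 0xD6, 0x99 }, /* Ö */
    { 0x00DC, 0xDC, 0x9A }, /* Ü */
    { 0x00E4, 0xE4, 0x84 }, /* ä */
    { 0x00F6, 0xF6, 0x94 }, /* ö */
    { 0x00FC, 0xFC, 0x81 }, /* ü */
    { 0x00DF, 0xDF, 0xE1 }, /* ß */
};

/* Converts a zero-terminated array of code points to a zero-terminated
   byte string for code page 850 or 1252. ASCII passes through unchanged,
   characters missing from the table become '?'.
   Returns the number of bytes written (without the terminator), or
   (size_t)-1 if the code page is not supported or out is too small. */
size_t encode_for_code_page(const unsigned short *text, unsigned code_page,
                            char *out, size_t out_size)
{
    size_t n = 0;

    if (code_page != 850 && code_page != 1252)
        return (size_t)-1;

    for (; *text != 0; ++text) {
        unsigned char byte = '?';
        size_t i;

        if (n + 1 >= out_size)          /* keep room for the terminator */
            return (size_t)-1;

        if (*text < 0x80) {
            byte = (unsigned char)*text;
        } else {
            for (i = 0; i < sizeof MAPPINGS / sizeof MAPPINGS[0]; ++i) {
                if (MAPPINGS[i].unicode == *text) {
                    byte = (code_page == 850) ? MAPPINGS[i].cp850
                                              : MAPPINGS[i].cp1252;
                    break;
                }
            }
        }
        out[n++] = (char)byte;
    }

    if (n >= out_size)                  /* only reachable for empty input */
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

The caller would pass the code page reported for the output handle and then write the resulting buffer with printf("%s", ...) or fputs().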
And of course, how the bytes of the strings in the executable are finally displayed on screen also depends on the output font used.
Upvotes: 3