Reputation: 179
I want to print blå using UTF-8, but I do not know how to do it. The UTF-8 encoding of b is 62, l is 6c, and å is c3 a5. I am not sure what to do with the å character. Here is my code:
#include <stdio.h>

int main(void) {
    char myChar1 = 0x62; //b
    char myChar2 = 0x6C; //l
    char myChar3 = ?? //å
    printf("%c", myChar1);
    printf("%c", myChar2);
    printf("%c", myChar3);
    return 0;
}
I also tried this:
#include <stdio.h>
#define SIZE 100

int main(void) {
    char myWord[SIZE] = "\x62\x6c\xc3\xa5\x00";
    printf("%s", myWord);
    return 0;
}
However, the output was:
bl├Ñ
Finally, I tried this:
#include <stdio.h>
#include <locale.h>
#define SIZE 100

int main(void) {
    setlocale(LC_ALL, ".UTF8");
    char myWord[SIZE] = "\x62\x6c\xc3\xa5\x00";
    printf("%s", myWord);
    return 0;
}
Same output as before.
I am not sure I understand Unicode fully. If I understand it correctly, UTF-16 and UTF-32 use wide characters, where each character requires the same number of bytes (2 or 4 for UTF-16). On the other hand, UTF-8 uses a variable number of bytes per character (1-4). I know the first 128 characters require 1 byte, and almost all of Latin-1 can be described with 2 bytes, etc. Since UTF-8 does not require wide characters, I do not need to use wchar functions in my code. Therefore, I do not see why my second and/or third code will not work. My only solution would be to include setmode to change the encodings of stdin and stdout, although I am not sure that would work and I am not sure how to implement it.
Summary:
Why doesn't my code work?
I am on Windows, using VS Code, with MinGW32 as the compiler.
Upvotes: 2
Views: 949
Reputation: 299265
Your second attempt is correct and does output UTF-8 as you wanted. The problem is that your terminal doesn't display UTF-8. See "Displaying Unicode in PowerShell" and "Using UTF-8 Encoding (CHCP 65001) in Command Prompt / Windows Powershell (Windows 10)" for discussion of displaying UTF-8 in Windows terminals.
Your current configuration is one in which 0xc3 encodes ├, which is probably CP850, which I believe is the default for some of the mingw-based terminals (MSYS, git bash). It's been a very long time since I've used mingw, but you may also want to see "How to set console encoding in MSYS?"
Upvotes: 4