Float

Reputation: 25

Setlocale not working

I'm trying to use the setlocale function so I can use Portuguese characters in the Windows console.

This is my code:

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>

int main()
{
    setlocale(LC_ALL, "Portuguese");
    printf("Bem-Vindo ao CALCULADORA SIMULATOR 2018 - FOSÓRIO EDITION\n");
}

But this is my output:

Bem-Vindo ao CALCULADORA SIMULATOR 2018 - FOSÃ"RIO EDITION

All other text written to cmd is shown correctly; only my program's output has this problem.

It looks like every accented character is replaced by "Ã" plus another seemingly random character. For example, printf("áàóõãÃ\n"); outputs this:

áà óõãÃ

Upvotes: 1

Views: 5407

Answers (3)

David Shang

Reputation: 11

Call SetConsoleOutputCP(CP_UTF8) and SetConsoleCP(CP_UTF8) at the start of your application. This lets the Windows console display UTF-8 text. You can then use printf instead of wprintf to print ordinary char strings encoded as UTF-8, which keeps your code compatible with the world outside Windows.
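A minimal sketch of that approach, assuming a UTF-8 capable console font. The helper names enable_utf8_console and print_greeting are made up for illustration; only SetConsoleOutputCP, SetConsoleCP and CP_UTF8 come from the Windows API. The Ó is written as explicit bytes so the result does not depend on how the source file is saved.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: switch the console's output and input code pages
   to UTF-8. Returns 1 on success, 0 on failure. On non-Windows systems
   it does nothing and reports success. */
int enable_utf8_console(void)
{
#ifdef _WIN32
    /* Both calls return nonzero on success. */
    return SetConsoleOutputCP(CP_UTF8) && SetConsoleCP(CP_UTF8);
#else
    return 1;
#endif
}

/* Hypothetical helper: print the greeting from the question with plain
   printf. "\xC3\x93" is the UTF-8 encoding of the letter Ó (U+00D3).
   Returns whatever printf returns (negative on error). */
int print_greeting(void)
{
    return printf("Bem-Vindo ao CALCULADORA SIMULATOR 2018 - FOS\xC3\x93RIO EDITION\n");
}
```

Call enable_utf8_console() once at the top of main, before the first printf, and then call print_greeting() or any other printf with UTF-8 text.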

Enabling "Beta: Use Unicode UTF-8 for worldwide language support" in the Windows settings dialog is not a good solution. You certainly do not want your users to have to do this in order to run your code. Besides, this option is problematic: it may cause the Windows console to display garbage or stop working in certain cases.

Upvotes: 0

José Silva

Reputation: 41

I had the same problem on Windows, and it remained after I changed the code page with chcp (850, 860, and 1252).

So I tried wprintf, but that didn't work either.

I only got it to work by going to Settings > Time & Language > Language > Administrative language settings > Change system locale, selecting "Beta: Use Unicode UTF-8 for worldwide language support", and clicking OK. After restarting the computer, the problem with printf("áàóõãÃ\n") was solved.

Upvotes: 4

zwol

Reputation: 140445

First, setlocale takes ISO 639-1 language codes, not the full names of languages in English (plus suffixes that let you, for instance, distinguish Brazilian from European Portuguese; the complete syntax is documented in MSDN under "Locale Names, Languages, and Country/Region Strings").

Second, the output you got, with a string of accented letters ÓáàóõãÃ each becoming a two-character sequence starting with Ã, is a characteristic mojibake pattern for UTF-8 being misinterpreted as Windows-1252. UTF-8 is a variable-length encoding for Unicode "codepoints", in which the accented characters you're trying to use each become two-byte sequences; Windows-1252 is a fixed-length encoding, so each of those pairs of bytes is misinterpreted as two characters. Here's how it happens for those specific characters:

character   codepoint    UTF-8 two-byte sequence   Windows-1252
---------   ---------    -----------------------   ------------
Ó           U+00D3       0xC3 0x93                 Ã “
á           U+00E1       0xC3 0xA1                 Ã ¡
à           U+00E0       0xC3 0xA0                 Ã □
ó           U+00F3       0xC3 0xB3                 Ã ³
õ           U+00F5       0xC3 0xB5                 Ã µ
ã           U+00E3       0xC3 0xA3                 Ã £
Ã           U+00C3       0xC3 0x83                 Ã ƒ

(the white square on the à line is standing in for a non-breaking space)
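To see where the bytes in that table come from, here is a small sketch of the two-byte UTF-8 encoding rule. The function name utf8_encode_2byte is made up for this illustration. Every codepoint from U+00C0 to U+00FF has the same top bits, so the first byte always comes out as 0xC3, which Windows-1252 displays as Ã.

```c
/* Hypothetical helper: encode a codepoint in the range U+0080..U+07FF
   as two UTF-8 bytes. Returns 0 on success, or -1 if the codepoint
   does not fit the two-byte form. */
int utf8_encode_2byte(unsigned cp, unsigned char out[2])
{
    if (cp < 0x80 || cp > 0x7FF)
        return -1;
    out[0] = (unsigned char)(0xC0 | (cp >> 6));   /* 110xxxxx: upper 5 bits */
    out[1] = (unsigned char)(0x80 | (cp & 0x3F)); /* 10xxxxxx: lower 6 bits */
    return 0;
}
```

For example, passing 0x00D3 (Ó) fills out with 0xC3 0x93, matching the first row of the table.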

This is a typical way for "narrow" text output to be mangled by the Windows console. Windows uses UTF-16 for almost everything internally, which means it often works better to use C's "wide character" library. Try this program instead:

#include <wchar.h>
#include <locale.h>

int main(void)
{
    setlocale(LC_ALL, "pt"); // also try "pt_BR"
    wprintf(L"Bem-Vindo ao CALCULADORA SIMULATOR 2018 - FOSÓRIO EDITION\n");
}

Note: Most other operating systems were slower to take the Unicode plunge than Windows was, and came to it only after it had become obvious that UTF-8 was a better choice than UTF-16, which means the "wide character" library should be avoided on all operating systems except Windows. Don't worry about this until you need to write a program that works on both Windows and non-Windows.

Upvotes: 3
