Nikola Stankov

Reputation: 131

Reading Cyrillic from the console in C++

I'm trying to read Cyrillic text ("Иванчо говори само глупости") from the console, but all I get is "????". This is my first time writing C++, and I would be very grateful if someone could help me solve this problem.

This is my code:

#include<iostream>
#include<string>
#include<map>
#include<Windows.h>
#include<clocale>

using namespace std;

bool CheckLetters(int letter)
{
    SetConsoleCP(1251);
    SetConsoleOutputCP(1251);

    bool isCyrillic = ('\u0410' <= letter && letter <= '\u044f');
    if ((letter >= 'a' && letter <= 'z')
        || (letter >= 'A' && letter <= 'Z')
        || isCyrillic)
    {
        return true;
    }
    return false;
}

int main()
{
    string input;
    map<unsigned char, int> letters;

    getline(cin, input);

    for (int i = 0; i < input.size(); i++)
    {
        unsigned char currentLetter = input[i];
        if (CheckLetters(currentLetter))
        {
            map<unsigned char, int>::iterator elementIter = letters.find(currentLetter);
            if (elementIter == letters.end())
            {
                letters[currentLetter] = 1;
            }
            else
            {
                letters[currentLetter] ++;
            }
        }

    }

    for (map<unsigned char, int>::iterator current = letters.begin();
         current != letters.end(); current++)
    {
        pair<unsigned char, int> currentElement = *current;
        cout << currentElement.first << " " << currentElement.second <<endl;
    }

    return 0;
}


Upvotes: 0

Views: 3409

Answers (3)

Nikola Stankov

Reputation: 131

My main problem was that I hadn't set the encoding in Visual Studio at the start. So I created a new project and set the code page to 1251. This is my code:

#include<iostream>
#include<string>
#include<map>
#include<windows.h>
#include<locale>

using namespace std;

bool CheckLetters(wchar_t letter)
{
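    // main() narrows each character to a signed char, so with MSVC the CP1251
    // bytes 0xC0..0xFF (А..я) arrive here sign-extended as 65472..65535.
    bool isCyrillic = 65472 <= letter && letter <= 65535;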
    if ((letter >= 'a' && letter <= 'z')
        || (letter >= 'A' && letter <= 'Z')
        || isCyrillic)
    {
        return true;
    }
    return false;
}


int main()
{

    SetConsoleCP(1251);
    SetConsoleOutputCP(1251);

    wstring input;
    map<wchar_t, int> letters;

    getline(wcin, input);

    for (int i = 0; i < input.size(); i++)
    {
        char currentLetter = input[i];

        if (CheckLetters(currentLetter))
        {
            map<wchar_t, int>::iterator elementIter = letters.find(currentLetter);
            if (elementIter == letters.end())
            {
                letters[currentLetter] = 1;
            }
            else
            {
                letters[currentLetter] ++;
            }
        }

    }

    for (map<wchar_t, int>::iterator current = letters.begin();
        current != letters.end(); current++)
    {
        pair<wchar_t, int> currentElement = *current;
        cout << (char)(currentElement.first) << " = " << currentElement.second << endl;
    }

    return 0;
}
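
For anyone puzzled by the range 65472..65535 above: in Windows-1251 the letters А..я occupy the single bytes 0xC0..0xFF, and those bytes turn into 65472..65535 when a signed char is widened to a 16-bit wchar_t. Below is a minimal sketch of the same letter count that works on the raw bytes instead. The names isLetterCp1251 and countLettersCp1251 are made up for this illustration, and it assumes the input really is CP1251 encoded (for example, read with getline(cin, ...) after SetConsoleCP(1251)). Note that Ё and ё sit outside that range (0xA8 and 0xB8), so this sketch does not count them.

```cpp
#include <map>
#include <string>

// Returns true for ASCII letters and for the Cyrillic letters А..я,
// which Windows-1251 stores in the single bytes 0xC0..0xFF.
// (Hypothetical helper, named only for this sketch.)
bool isLetterCp1251(unsigned char byte)
{
    bool isLatin = (byte >= 'a' && byte <= 'z') || (byte >= 'A' && byte <= 'Z');
    bool isCyrillic = byte >= 0xC0;  // the upper bound 0xFF is implied by the type
    return isLatin || isCyrillic;
}

// Counts how often each letter byte occurs in a string that is
// assumed to be Windows-1251 encoded.
std::map<unsigned char, int> countLettersCp1251(const std::string& text)
{
    std::map<unsigned char, int> counts;
    for (unsigned char byte : text)
    {
        if (isLetterCp1251(byte))
        {
            ++counts[byte];  // operator[] value-initializes a missing entry to 0
        }
    }
    return counts;
}
```

Printing the keys back with cout should show the letters correctly only while the console output code page is also 1251.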

Thanks to everyone who gave me advice.

Upvotes: 1

Barmak Shemirani

Reputation: 31599

Using Unicode is recommended over changing the code page to Russian or any other specific language. Windows APIs use UTF-16, but unfortunately the Windows console has limited Unicode support. Here is a solution that is specific to the Windows console and Visual Studio (it won't work with MinGW, for example). It still won't work with some Asian languages (or at least I don't know how to make it work).

#include <iostream>
#include <string>
#include <io.h> //for _setmode
#include <fcntl.h> //for _O_U16TEXT

int main() 
{
    _setmode(_fileno(stdout), _O_U16TEXT);
    _setmode(_fileno(stdin), _O_U16TEXT);
    std::wcout << L"ελληνικά Иванчо English\n";

    std::wstring str;
    std::wcin >> str;
    std::wcout << L"output: " << str << L"\n";

    return 0;
}

Note that you cannot use std::cin and std::cout after changing the mode to UTF-16. You have to set the mode back to _O_TEXT if you want to keep using ANSI input/output. Example:

_setmode(_fileno(stdout), _O_TEXT);
_setmode(_fileno(stdin), _O_TEXT);
std::cout << "Test\n";

After receiving input in UTF-16, you may want to use WideCharToMultiByte(CP_UTF8, ...) to convert it to UTF-8 (which is stored in char) for compatibility with network functions etc.
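
As a rough sketch of that conversion step: the helper below (toUtf8 is a hypothetical name, not something from the Windows API) calls WideCharToMultiByte twice on Windows, once to get the required size and once to convert. The non-Windows branch is only a hand-written stand-in so the snippet also builds elsewhere; it assumes well-formed input and does no error reporting.

```cpp
#include <cstddef>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: converts a wide string to a UTF-8 encoded std::string.
std::string toUtf8(const std::wstring& wide)
{
    if (wide.empty())
    {
        return std::string();  // WideCharToMultiByte rejects a zero-length input
    }
#ifdef _WIN32
    const int wideLength = static_cast<int>(wide.size());
    // First call: ask how many bytes the UTF-8 result needs.
    int size = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLength,
                                   nullptr, 0, nullptr, nullptr);
    if (size <= 0)
    {
        return std::string();
    }
    std::string utf8(static_cast<std::size_t>(size), '\0');
    // Second call: perform the conversion into the buffer.
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wideLength,
                        &utf8[0], size, nullptr, nullptr);
    return utf8;
#else
    // Portable stand-in (an assumption of this sketch, not the Windows API).
    std::string utf8;
    for (std::size_t i = 0; i < wide.size(); ++i)
    {
        unsigned long cp = static_cast<unsigned long>(wide[i]);
        // Combine a UTF-16 surrogate pair into one code point if present.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < wide.size())
        {
            unsigned long low = static_cast<unsigned long>(wide[i + 1]);
            if (low >= 0xDC00 && low <= 0xDFFF)
            {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                ++i;
            }
        }
        if (cp < 0x80)
        {
            utf8 += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            utf8 += static_cast<char>(0xC0 | (cp >> 6));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            utf8 += static_cast<char>(0xE0 | (cp >> 12));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            utf8 += static_cast<char>(0xF0 | (cp >> 18));
            utf8 += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            utf8 += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            utf8 += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return utf8;
#endif
}
```

With the program above, one would call toUtf8(str) on the std::wstring read from std::wcin and pass the resulting bytes on to whatever expects UTF-8.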

Upvotes: 4

Vadzim Savenok

Reputation: 940

How about this combination?

setlocale(LC_ALL, "Russian");
SetConsoleOutputCP(866);

Upvotes: 2
