Andrea

Reputation: 4493

How to correctly read European characters (from file and command shell) in C++?

In my program I read a text file by opening it with ifstream and using a stringstream to split each line into tokens with getline. When I read a European character such as "è", it is stored as "├¿"; I expected this, because I'm using string and not wstring. But when I read a line from cmd (I'm on Windows), the same "è" is stored as "è" inside the string. My goal is to compare strings read from the file with strings read from the command shell, but if they are encoded differently I'm stuck, because "è".compare("├¿") is naturally != 0. I would be happy with both "wrong" or both correct, because I don't need to display them, only count occurrences. I'm using the latest version of Code::Blocks, with MinGW 32-bit and gcc 4.7.1.

UPDATE (code)

ifstream file;
stringstream stream;

file.open(path);

while( file ){

    while( getline(file,line) ){

        it = 1;
        stream << line;

        if( line.compare("")!=0 ){
            while( getline(stream,token,'\t')) {

                if( it == 1 ){
                    ID = atoi( token.c_str() );
                }
                if( it == 2 ){
                    word = token;

                    if( !case_sensitive ){
                        word = get_lower_case( word );
                    }
                }
                if( it == tags_index ){
                    pos = token;
                }

                it++;
            }

            data.push_back(make_row(ID,word,pos));
        }

        stream.clear();
    }
}

This is part of the function I use to read the file (I have a struct to store each entry of a tab-separated file; my problem is with "word").

getline(cin,sentence);

[...]

stringstream stream;
string token;
vector<string> tokens;

stream << sentence;
while( getline(stream,token,' ') ){
    tokens.push_back(token);
}
stream.clear();

This is how I read the input stream in the command shell.
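Because the comparison depends on the raw bytes rather than on how they are displayed, it may help to print the bytes each path actually stores. The sketch below is only a diagnostic, not a fix; `to_hex` is a hypothetical helper name. For reference, "è" encoded as UTF-8 is the two bytes c3 a8, which an OEM code page such as 850 would display as "├¿", so the file is likely UTF-8 while the console input likely uses the console code page.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the bytes of s as space-separated
// lowercase hex pairs, e.g. "c3 a8" for a UTF-8 encoded "è".
std::string to_hex(const std::string& s)
{
    const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Go through unsigned char so bytes >= 0x80 are not sign-extended.
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[c >> 4];
        out += digits[c & 0x0F];
    }
    return out;
}
```

Calling `to_hex(word)` on a token read from the file and `to_hex(token)` on the same word typed in the shell should show whether the two strings differ only in encoding.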

Upvotes: 0

Views: 125

Answers (1)

vsoftco

Reputation: 56577

You can try setting (imbuing) the locale on the streams:

#include <iostream>
#include <locale>

int main()
{
    auto loc = std::locale("it_IT"); // Example: Italian locale
    std::cin.imbue(loc); // imbue it to input stream, can use a fstream here
    std::cout.imbue(loc); // imbue it to output stream

    // rest of the program
}
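Whether a named locale such as "it_IT" exists depends on the platform and the standard library, and the `std::locale` constructor throws `std::runtime_error` when the name is not recognized. A minimal sketch of a guarded version, assuming that falling back to the classic "C" locale is acceptable (`locale_or_classic` is a hypothetical helper name):

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: tries to construct the named locale and falls
// back to the classic "C" locale if the name is not available.
std::locale locale_or_classic(const std::string& name)
{
    try {
        return std::locale(name.c_str());
    } catch (const std::runtime_error&) {
        // The requested locale is not installed or not supported here.
        return std::locale::classic();
    }
}
```

It could then be used as `std::cin.imbue(locale_or_classic("it_IT"));`, and the same call works on an fstream.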

Upvotes: 1
