Oodini

Reputation: 1347

Facet ctype, do_is() and specializations

I derived from std::ctype to build my own facet and override its virtual function do_is(). My goal is to make the stream extractor stop treating the space character as whitespace while still tokenizing on the tab character. The override delegates everything else to the base-class implementation. But it only compiles with wchar_t: the char specialization, std::ctype<char>, has no do_is(). This is the case with both gcc and VS 2010.

Here is my code; uncomment the fifth line (#define WIDE_CHARACTERS) to switch between the two versions.

#include <iostream>
#include <locale>
#include <sstream>

// #define WIDE_CHARACTERS

#ifdef WIDE_CHARACTERS
typedef wchar_t CharacterType;
std::basic_string<CharacterType> in = L"string1\tstring2 string3";
std::basic_ostream<CharacterType>& consoleOut = std::wcout;
#else
typedef char CharacterType;
std::basic_string<CharacterType> in = "string1\tstring2 string3";
std::basic_ostream<CharacterType>& consoleOut = std::cout;
#endif

struct csv_whitespace : std::ctype<CharacterType>
{
    bool do_is(mask m, char_type c) const
    {  
        if ((m & space) && c == ' ')
        {
            return false; // space will NOT be classified as whitespace
        }

        return ctype::do_is(m, c); // leave the rest to the parent class
    }
};

int main()
{
    std::basic_string<CharacterType> token;

    consoleOut << "locale with modified ctype:\n";
    std::basic_istringstream<CharacterType> s2(in);
    s2.imbue(std::locale(s2.getloc(), new csv_whitespace()));
    while (s2 >> token)
    {
        consoleOut << "  " << token << '\n';
    }
}

Upvotes: 0

Views: 429

Answers (2)

Oodini

Reputation: 1347

Thank you!

I wrote the following code based on the link you gave, and it works.

#include <iostream>
#include <vector>
#include <locale>
#include <sstream>

// This ctype facet declassifies spaces as whitespace
struct CSV_whitespace : std::ctype<char>
{
    static const mask* make_table()
    {
        // make a copy of the "C" locale table
        static std::vector<mask> v(classic_table(), classic_table() + table_size);

        // space will not be classified as whitespace
        v[' '] &= ~space;

        return &v[0];
    }

    CSV_whitespace(std::size_t refs = 0) : ctype(make_table(), false, refs) {}
};

int main()
{
    std::string token;

    std::string in = "string1\tstring2 string3";

    std::cout << "locale with modified ctype:\n";
    std::istringstream s(in);
    s.imbue(std::locale(s.getloc(), new CSV_whitespace()));
    while (s >> token)
    {
        std::cout << "  " << token << '\n';
    }
}

Upvotes: 1

David G

Reputation: 96845

Narrow-character streams classify characters through a table lookup (presumably as an optimization). std::ctype<char> is a specialization with no virtual do_is(), so your override will only work for character types other than char. The C++ reference page for std::ctype<char> shows how the table is used to classify characters.

Upvotes: 0
