user2503048

Reputation: 1041

C++ custom lazy iterator

I have a somewhat simple text file parser. The text I parse is split into blocks denoted by { block data }.

My parser has a string read() function that returns the next token, so in the example above the first token is {, followed by block, then data, then }.

To make things less repetitive, I want to write a generator-like iterator that will allow me to write something similar to this JavaScript code:

* readBlock() {
    this.read(); // {

    let token = this.read();

    while (token !== '}') {
        yield token;

        token = this.read();
    }
}

which in turn allows me to use simple for-of syntax:

for (let token of parser.readBlock()) {
    // block
    // data
}

For C++ I would like something similar:

for (string token : reader.read_block())
{
    // block
    // data
}

I googled around to see if this can be done with an iterator, but I couldn't figure out whether I can have a lazy iterator like this, one with no predefined beginning or end. That is, its beginning is the current position of the reader (an integer offset into a vector of characters), and its end is reached when the token } is found. I don't need to construct arbitrary iterators, iterate in reverse, or compare two iterators for equality, since the goal is purely to make linear iteration less repetitive.

Currently every time I want to read a block, I need to re-write the following:

stream.skip(); // {
while ((token = stream.read()) != "}")
{
    // block
    // data
}

This becomes very messy, especially when I have blocks inside blocks. To support blocks inside blocks, the iterators would have to all reference the same reader's offset, such that an inner block will advance the offset, and the outer block will re-start iterating (after the inner is finished) from that advanced offset.

Is this possible to achieve in C++?

Upvotes: 3

Views: 2478

Answers (1)

anarthal

Reputation: 170

To be usable in a range-based for loop, a class needs begin() and end() functions (members, or free functions found by argument-dependent lookup) that return iterators.
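
As a rough illustration of that requirement, the sketch below writes the same loop twice: once by hand with begin() and end(), once with range-based for. Countdown, expand_by_hand and with_range_for are hypothetical names made up for this example, and the range assumes a non-negative start value.

```cpp
#include <vector>

// Hypothetical range used only for illustration: counts down from start to 1.
// Assumes start >= 0.
class Countdown
{
    int start_;
public:
    class iterator
    {
        int value_;
    public:
        explicit iterator(int value): value_ {value} {}
        int operator*() const { return value_; }
        iterator& operator++() { --value_; return *this; }
        bool operator!=(const iterator& rhs) const { return value_ != rhs.value_; }
    };

    explicit Countdown(int start): start_ {start} {}
    iterator begin() const { return iterator{start_}; }
    iterator end() const { return iterator{0}; }
};

// Roughly what `for (int v : range) out.push_back(v);` expands to.
std::vector<int> expand_by_hand(const Countdown& range)
{
    std::vector<int> out;
    auto first = range.begin();
    auto last = range.end();
    for (; first != last; ++first)
    {
        int v = *first;
        out.push_back(v);
    }
    return out;
}

// The same loop written with range-based for.
std::vector<int> with_range_for(const Countdown& range)
{
    std::vector<int> out;
    for (int v : range)
        out.push_back(v);
    return out;
}
```

The loop only ever uses operator*, pre-increment and operator!=, which is why a very small iterator is enough for this purpose.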

What is an iterator? Any object that fulfils a set of requirements. There are several kinds of iterators, depending on which operations they support. I suggest implementing an input iterator, which is the simplest: https://en.cppreference.com/w/cpp/named_req/InputIterator

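#include <cstddef>
#include <iterator>
#include <string>

class Stream
{
public:
    std::string read() { /* return the next token */ }
    bool valid() const { /* return true while more tokens are available */ }
};

class FileParser
{
    std::string current_;
    Stream* stream_;
public:
    class iterator
    {
        FileParser* obj_;
    public:
        using value_type = std::string;
        using difference_type = std::ptrdiff_t;
        using reference = const std::string&;
        using pointer = const std::string*;
        using iterator_category = std::input_iterator_tag;
        // A null obj_ marks the end iterator.
        iterator(FileParser* obj=nullptr): obj_ {obj} {}
        reference operator*() const { return obj_->current_; }
        pointer operator->() const { return &obj_->current_; }
        iterator& operator++() { increment(); return *this; }
        // Single pass: the returned copy shares the parser, so it already
        // sees the new token. Prefer pre-increment.
        iterator operator++(int) { iterator tmp {*this}; increment(); return tmp; }
        bool operator==(const iterator& rhs) const { return obj_ == rhs.obj_; }
        bool operator!=(const iterator& rhs) const { return !(rhs==*this); }
    private:
        void increment()
        {
            // Check before reading so that the last token is not dropped.
            if (obj_->valid())
                obj_->next();
            else
                obj_ = nullptr;
        }
    };

    FileParser(Stream& stream): stream_ {&stream} {}
    iterator begin()
    {
        // Read the first token so that *begin() is meaningful;
        // an empty stream yields end() straight away.
        if (!valid())
            return end();
        next();
        return iterator{this};
    }
    iterator end() { return iterator{}; }
    void next() { current_ = stream_->read(); }
    bool valid() const { return stream_->valid(); }
};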

So your end-of-file iterator is represented by an iterator pointing to no object.
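
The standard library uses the same convention: a default-constructed std::istream_iterator acts as the end-of-stream iterator. The short sketch below shows that convention in isolation. The helper name collect_tokens is made up for this example, and it splits on whitespace only, which may not match your tokenizer.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Collect whitespace-separated tokens using std::istream_iterator.
// The default-constructed iterator plays the role of "end of input".
std::vector<std::string> collect_tokens(const std::string& text)
{
    std::istringstream in {text};
    std::istream_iterator<std::string> first {in};
    std::istream_iterator<std::string> last;   // end-of-stream iterator
    return std::vector<std::string>(first, last);
}
```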

Then you can use it like this:

#include <iostream>

int main()
{
    Stream s; // Initialize it as needed
    FileParser parser {s};
    for (const std::string& token: parser)
    {
        std::cout << token << std::endl;
    }
}
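
The same idea can probably be adapted to the block case from the question, where iteration stops at } rather than at end of file and nested blocks share one reader offset. What follows is only a sketch under assumptions: Reader, BlockRange and collect are hypothetical names, the tokens are pre-split into a vector, and the rule that the token "child" introduces a nested block is invented purely to have something to recurse on.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's reader: tokens are pre-split and
// pos_ is the single shared offset that every block iterator advances.
class Reader
{
    std::vector<std::string> tokens_;
    std::size_t pos_ = 0;
public:
    explicit Reader(std::vector<std::string> tokens): tokens_ {std::move(tokens)} {}

    // Returns the next token, or "}" at end of input so that loops terminate.
    std::string read()
    {
        if (pos_ < tokens_.size())
            return tokens_[pos_++];
        return "}";
    }
};

// Lazy, single-pass range over the tokens of one block.
class BlockRange
{
    Reader* reader_;
public:
    class iterator
    {
        Reader* reader_;        // null once the closing brace has been read
        std::string token_;

        void advance()
        {
            token_ = reader_->read();
            if (token_ == "}")
                reader_ = nullptr;   // become equal to the end iterator
        }
    public:
        explicit iterator(Reader* reader = nullptr): reader_ {reader}
        {
            if (reader_)
                advance();           // load the first token of the block
        }
        const std::string& operator*() const { return token_; }
        iterator& operator++() { advance(); return *this; }
        bool operator!=(const iterator& rhs) const { return reader_ != rhs.reader_; }
    };

    explicit BlockRange(Reader& reader): reader_ {&reader} {}

    // Consumes the opening "{"; call it once per block (range-for does).
    iterator begin() { reader_->read(); return iterator{reader_}; }
    iterator end() { return iterator{}; }
};

// Walks a block and records each token prefixed with its nesting depth.
void collect(Reader& reader, int depth, std::vector<std::string>& out)
{
    for (const std::string& token : BlockRange{reader})
    {
        out.push_back(std::to_string(depth) + ":" + token);
        if (token == "child")        // invented rule: a nested block follows
            collect(reader, depth + 1, out);
    }
}
```

Because every iterator reads through the same Reader, the inner loop consumes its own closing brace and the outer loop simply continues from the advanced offset, which is the behaviour the question asks for.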

Upvotes: 4
