kfmfe04
kfmfe04

Reputation: 15327

How to implement an iterator/container that is standards-compliant?

Suppose I have an abstract container-like class called RuleBook. Users of RuleBook expect to be able to forward-iterate over the RuleBook to obtain Rules.

Unlike standard containers, there will be no restrictions on the memory layout of the the concrete subclasses here. Instead, it is up to the implementers of the subclasses to comply with the forward-iteration requirement of RuleBook and satisfy this requirement based on its own data structures.

I am thinking that RuleBook should contain pure virtual begin() and end() so it would work with range-based for, but I am running into a few problems.

What should the signatures be for begin() and end()? How should BasketballRules and CompanyRules be implemented?

How do we deal with the end condition when the iterator goes past the last item?

In the example below, you can assume that m_rp and m_rpp only point to one Rule each. We want to return some kind of iterator for the guts (like a Rule*). We can also assume that all subclasses of Foo will contain Rule in various data structures, which will be up to the whim of the implementor.

If I implement the entire thing using Rule* as my iterator and null_ptr as my beyond-the-endpoint, would this be STL compliant?

I am currently looking into custom iterators, but I'm not even sure this problem fits well with that paradigm because each subclass must effectively define the guts of the iteration.

CODE

struct RuleBook
{
  // virtual Rule* begin() = 0;
  // virtual Rule* end() = 0; 
};

struct CompanyRules :
    public RuleBook
{
    Rule m_r;
};

struct BasketballRules :
    public RuleBook
{
    // return the three Rules, one-at-a-time, in succession
    Rule   m_r;
    Rule*  m_rp;
    Rule** m_rpp;
};

int
main( int argv, char* argc[] )
{
}

Upvotes: 2

Views: 435

Answers (4)

Jonathan Wakely
Jonathan Wakely

Reputation: 171343

This will be difficult to get right.

What should the signatures be for begin() and end()?

There's not much choice, they pretty much have to be something like

RuleBook::iterator begin();
RuleBook::iterator end();

(Adding const overloads if desired)

How should BasketballRules and CompanyRules be implemented?

Carefully :)

How do we deal with the end condition when the iterator goes past the last item?

You design your iterator types correctly so it Just Works. You need an iterator type that can be compared for equality and can be incremented. When you have an iterator to the last item in a container and you increment it, it must become equal to the past-the-end iterator.

If I implement the entire thing using Rule* as my iterator and null_ptr as my beyond-the-endpoint, would this be STL compliant?

No. If your iterator type is just Rule* then incrementing an iterator doesn't move to the next Rule, it just points to the next location in memory, which might not even be a Rule object, leading to undefined behaviour. e.g. for BasketballRules if you have Rule* pointing to m_r and you increment it you are not pointing to a valid Rule object you're pointing to the memory occupied by m_rp i.e. a Rule* and dereferencing it is undefined behaviour.

Also, if you keep incrementing a Rule* you never reach the past-the-end nullptr value.

I gave Yakk's answer an upvote because it is a plausible implementation, but it would be hard to get right. There are lots of things to consider and include in the polymorphic interface e.g. what happens if you use == to compare two RuleBook::iterator objects where one points to a CompanyRules and one points to a BasketballRules, how does equality work for polymorphic iterators?

What happens if you assign the iterator to the BasketballRules object to an iterator to a CompanyRules object? You need a "deep copy" for the polymorphic types.

You'd need a different derived iterator-impl type for each container, the iterator type for CompanyRule containers needs to know everything about the CompanyRule type, and so on for every derived container type. Each of those concrete iterator-impl types needs to implement almost the entire iterator interface as virtual functions. The difficulty of implementing it indicates a problem with the design.

A simpler design would be for each derived container type to manage an actual physical container of the same type. The code specific to each derived container consists of just updating the contents of the list whenever the derived object's contents change. The iterator types are then straightforward and non-polymorphic e.g.

struct RuleBook
{
  typedef std::vector<Rule*> list_type;
  typedef list_type::iterator iterator;

  virtual iterator begin() = 0;
  virtual iterator end() = 0;
};

struct CompanyRules :
    public RuleBook
{
    CompanyRules() : m_list{ &m_r } { }
    Rule m_r;

    iterator begin() { return m_list.begin(); }
    iterator end() { return m_list.end(); }

private:
    list_type m_list;
};

struct BasketballRules :
    public RuleBook
{
    BaseketballRules() : m_list{ &m_r, m_rp, *m_rpp } { }

    // return the three Rules, one-at-a-time, in succession
    Rule   m_r;
    Rule*  m_rp;
    Rule** m_rpp;

    iterator begin() { return m_list.begin(); }
    iterator end() { return m_list.end(); }

private:
    // N.B.
    // must update m_list[1] any time m_rp changes
    // must update m_list[2] any time the pointee of m_rpp changes (harder!)
    list_type m_list;
};

Upvotes: 2

Yakk - Adam Nevraumont
Yakk - Adam Nevraumont

Reputation: 275600

Boost has some iterator helper template classes -- crtp that require you to implement a handful of methods. A pImpl based user of them would allow compliant and virtual behavior for your iterators. Ie, the pImpl is a pure virtual abstract class which the iterator delegates its work to.

Write a pure virtual pImpl iterator. That is the return type of begin and end. Child classes make instances of the iterator with concrete pImpl instances, custom written for how the data is stored.

Upvotes: 1

ritsz
ritsz

Reputation: 1

May be you could try something like the way Alex Allain explains over here in his tutorial about range-based for loops. It also has an example for making a data structure that is iterable by a range-based for loop

The things that are required are

  1. a .begin() and a .end() method. They can be standalone as well.
  2. operator!= , ++ and * overloading so that iterator operations that need to be performed are supported.

Upvotes: 0

bames53
bames53

Reputation: 88195

The exact signatures don't matter because range-based for-loops are defined in terms of the expression used. If begin or end member functions are found then they are called like __range.begin() and __range.end(). An example of the signature not mattering is that these member functions could have any number and type of parameters as long as they can be called like .begin() and .end() (which means the parameters must have default values).

If the type does not have begin and end member functions then the expressions begin(__range) and end(__range) are used. (where __range is auto &&__range initialized with the expression you use in the range-for-loop). So again the exact signature does matter as long as the argument dependent lookup works.


How should BasketballRules and CompanyRules be implemented?

How do we deal with the end condition when the iterator goes past the last item?

These are separate questions from how range-base-for-loops work. You should ask other questions about these specifically.

But if you use pointers as your iterators then using nullptr as the end iterator would not be appropriate because incrementing the last valid pointer would not give you a null pointer; it would give you a pointer one past the end of the range. Also if you use Rule* as your iterator then you're not leaving the implementation up to the container class; the container will have to maintain a contiguous array of Rules.

Upvotes: 1

Related Questions