Slazer
Slazer

Reputation: 4990

Implementing an iterator over binary (or arbitrary) tree using C++ 11

I would like to create an iterator over the binary tree so as to be able to use range-based for loop. I understand I ought to implement the begin() and end() function first.

Begin should probably point to the root. According to the specification, however, the end() functions returns "the element following the last valid element". Which element (node) is that? Would it not be illegal to point to some "invalid" place?

The other thing is the operator++. What is the best way to return "next" element in tree? I just need some advice to begin with this programming.


I would like to expand/augment my question*. What if I wanted to iterate over a tree with an arbitrary arity? Let each node have a vector of children and let begin() point to the "real" root. I would probably have to implement a queue (for breadth-first) inside the iterator class to store the unique_ptr's to nodes, right? Then, when the queue is empty I would know that I have passed all nodes and thus should return TreeIterator(nullptr) when oprator++() is called. Does it make sense? I want it as simple as possible and only forward iteration.

*Or should I create a new thread?

Upvotes: 11

Views: 27780

Answers (3)

PiotrNycz
PiotrNycz

Reputation: 24347

Look at SGI implementation of RBTree (this is the base for std::set/std::map... containers).

http://www.sgi.com/tech/stl/stl_tree.h

You will see that begin() is the leftmost node.

You will see that end() is a special "empty" node header which is the super root - I mean a real root (preset only if the tree is not empty) is its child node.

operator ++ is to go to right child and then find this child leftmost node. If such child does not exist - we go to left parent of the rightmost parent node. As in this example (red lines are skip move, blue ones ends are the given steps of iteration):

enter image description here

The code copied from SGI:

  void _M_increment()
  {
    if (_M_node->_M_right != 0) {
      _M_node = _M_node->_M_right;
      while (_M_node->_M_left != 0)
        _M_node = _M_node->_M_left;
    }
    else {
      _Base_ptr __y = _M_node->_M_parent;
      while (_M_node == __y->_M_right) {
        _M_node = __y;
        __y = __y->_M_parent;
      }
      if (_M_node->_M_right != __y)
        _M_node = __y;
    }
  }

When a tree is empty - begin() is leftmost of header - so it is header itself - end() is also header - so begin() == end(). Remember - your iteration scheme must match this condition begin() == end() for empty trees.

This seems to be very smart iteration scheme.

Of course you can define you own scheme - but the lesson learned is to have special node for end() purpose.

Upvotes: 11

Vaughn Cato
Vaughn Cato

Reputation: 64298

An iterator for a tree is going to be more than just a pointer, although it will likely contain a pointer:

struct TreeIterator {
  TreeIterator(TreeNode *node_ptr) : node_ptr(node_ptr) { }
  TreeIterator &operator++();
  TreeIterator operator++(int);
  bool operator==(const TreeIterator &) const;

  private:
    TreeNode *node_ptr;
};

TreeIterator begin(const Tree &tree) { ... }
TreeIterator end(const Tree &tree) { ... }

You can make your end() function return something special, like TreeIterator(nullptr)

What your begin iterator points to will depend on the type of traversal that you want. If you are doing breadth-first traversal, then starting at the root makes sense.

Upvotes: 1

Dietmar Kühl
Dietmar Kühl

Reputation: 153792

Where your begin() should point to pretty much depends on the order in which you want to traverse your tree. Using the root may be sensible, e.g., for a breadth first walk of the tree. The end() doesn't really sit on a tree node: this position is accessed and indicates that the end of the sequence is reached. Whether it indicates anything related to the tree somewhat depends on what sort of iteration you want to support: when supporting only forward iteration it can just indicate the end. When also supporting bidirectional iteration, it needs to know how to find the node right before the end.

In any case, the place pointed to isn't really accessed and you need a suitable indicator. For a forward iteration only iterator end() could just return an iterator pointing to null and when you move on from the last node you just set the iterator's pointer to null as well: equality comparing the two pointers would yield true, indicating that you have reached the end. When wanting to support bidirectional iteration you'll need some sort of link record which can be used to navigate to the previous node but which doesn't need to store a value.

The ordered associated containers (std::map<K, V>, std:set<V>, etc.) are internally implemented as some sort of tree (e.g., a Red/Black-tree). The begin() iterator starts with the left-most node and the end() iterator refers to the position after the right-most node. The operator++() just finds the next node to the right of the current:

  • if the iterator sits on a node without a right child node, it walks along the chain of parents until it finds a parent reaching its child via the left branch of the tree
  • if it sits on a node with a right child node it walks to the child and then down the sequence of left children of this child (if any) to find the left-most child in the right subtree.

Obviously, if you don't walk your tree from left to right but rather, e.g., from top to bottom, you'll need a different algorithm. The easiest approach for me is to draw a tree on a piece of paper and see how to get to the next node.

If you haven't implemented a data structure using your own iterators I'd recommend trying things out on a simple sequential data structure, e.g., a list: There it is pretty obvious how to reach the next node and when the end is reached. Once the general iteration principle is clear, creating a tree is just a matter of getting the navigation right.

Upvotes: 13

Related Questions