mmocny
mmocny

Reputation: 8855

C++ Range-adaptors/views and iterator invalidation rules

I have not found any direct reference to range/range-adaptor/range-view specific invalidation rules when modifying the underlying container.

Intuition suggests it would be exactly the same as pointer/iterator invalidation rules -- which are specified within the containers section of the standard.

The current container invalidation wording is as follows:

"...invalidates all the references, pointers, and iterators referring to the elements in the sequence, as well as the past-the-end iterator."

Which raises the question: Do all ranges necessarily "refer to the elements of the sequence", or, could they be accessing elements through the interface of the container?

It seems to me that most range adaptors already access a sequence without referring directly to the elements of that sequence (i.e. lazy views just build up iterator adaptors).

What seems to matter is the underlying range at the base of the view pyramid, so to speak.


We all learn at some point, that you cannot do std::vector::push_back while iterating that same vector, because the memory may move and invalidate the iteration. But, we also learn, that you can use std::vector::operator[] access with push_back, so long as you are careful with checking your size() bounds correctly.

It seems to me the same rules would apply to ranges/adaptors/views.

So: is it possible to force some equivalent to std::ranges::views​::​all (or, perhaps take_view) over a random access container to use array indexing (or some equivalent indirect/lazy element access), and to not use iteration directly?


Something to allow this:

std::vector<People> people = ...;
for (auto& person : std::ranges::views::lazy_all(people)) { // or ranges::lazy_take_view(people, people.size())
  if (person.has_new_child()) {
    people.push_back(person.get_new_child());
  }
}

Upvotes: 8

Views: 814

Answers (3)

Morgan
Morgan

Reputation: 342

I think the best bet to get functionality similar to what you seek is attempting to implement a custom iterator type that doesn't suffer from iterator invalidation in the same way. Doing that should work since the iterators that the view stores are of the type of the iterators used in the range.

One way you may be able to do that would be to use a pointer to the std::vector<...> and an index. That means that push_back only will affect the end iterator.

To get around invalidation of the end iterator, one simple way would be to iterpret the index value of -1 to always refer to end(), and then writing a custom comparison operator to account for that.

template<typename T>
class Iterator
{
    ......
    bool operator==(const Iterator& rhs)
    {
        if (conatiner != rhs.container)
            return false;
        return index == rhs.index || isEnd() && rhs.isEnd();
    }
    bool isEnd() const
    {
        return index == -1 || index == container->size();
    }
    std::vector<T>* container;
    int index;
};

Upvotes: 0

Nicol Bolas
Nicol Bolas

Reputation: 473272

Each range adaptor or view has its own rules for how it interacts with the underlying range. And the standard spells out these interactions, if sometimes obliquely.

ranges::ref_view for example is explicitly stated to work as if it holds a pointer to the range. And its begin/end functions behave as if they call that range's begin/end functions, as well as forwarding any other functionality to the range it was given. So its interactions are pretty clear, since its iterators are the exact same type as those of the underlying range.

For a range like ranges::filter_view, it's a bit more difficult to track down. filter_view's iterator behavior is based on the behavior of filter_iterator. That type behaves as if it stores an iterator to the underlying range (due to the exposition-only iterator member). So a filter_iterator will be invalidated whenever the underlying range's iterators are. And there is no exposition member of filter_view that holds an iterator, so you might expect that calling begin will always get a fresh, valid filter_iterator.

But it won't. There is a major caveat buried in the description of filter_view::begin. A semantic component of the behavior of a range's begin function is that it must execute in amortized constant time. Finding the start of a filtered list is a linear-time operation. Therefore, filter_view::begin is required to only do this linear-time operation once and then internally cache the result.

Which means it does store an iterator, even if it's not clear that it does. So if you invalidate whatever the begin filter_iterator is using, you have invalidated the filter_range's begin iterator and must construct a new one.

In summary, if you want to know the iterator invalidation behavior for a view, you must read through the entire description of that view. But this is true of containers as well; there isn't a nice, neat section of the standard which spells out exactly when iterators, references, and pointers are invalidated. Cppreference has a nice list, but the standard leaves it up to the definition of each function in the class.

Upvotes: 0

Nikolai Kozlov
Nikolai Kozlov

Reputation: 51

I'm currently playing with C++20 ranges and while implementing my own view I came up with the same question: what are the rules for iterator invalidation for views?

As far as I can see, ranges heavily use meta-programming and under the hood they construct a hierarchy of state machines. Actual types of these state machines are often hidden [1], so we may have difficulties to assume their restrictions. Iterator invalidation is a part of these restrictions, so to specify when and how iterators are invalidated when we construct a view hierarchy can be quite challenging if not impossible. Even if we manage to describe these rules they could be impossible to memorise, not to mention to use efficiently.

Ranges V3 library has the following recommendation:

View validity

Any operation on the underlying range that invalidates its iterators or sentinels will also invalidate any view that refers to any part of that range. Additionally, some views (e.g., views::filter), are invalidated when the underlying elements of the range are mutated. It is best to recreate a view after any operation that may have mutated the underlying range.

https://github.com/ericniebler/range-v3/blob/master/doc/index.md

The restriction given above just slashes away all concerns and although it is stricter than the rules for the standard containers it establishes a simple rule to memorise about any view iterators invalidation. At the same time it gives freedom to change a view implementation without touching that rule.

So, I suppose, it is safe to assume that ranges in C++20 standard are subject to the same limitations.

[1] My observation are based on MSVC implementation for ranges where range adapters can actually spawn different types based on strategies. So when you pipeline std::views::take(), for example, you may suddenly end up with std::span().

Upvotes: 1

Related Questions