Infiltrator
Infiltrator

Reputation: 1681

Keeping a valid vector::iterator after erase()

EDIT: I have had a lot of answers telling me that I should separate the deletion into another loop. Perhaps I did not make it clear enough, but I stated in my last paragraph that I'd like to find a solution OTHER than that. ie keeping the current code structure, but using some little-known C++fu to make it work.

Well, I know that calling erase() on a vector invalidates iterators to the element and all those after it, and that erase() returns an iterator to the next valid iterator, but what if the erase happens elsewhere?

I have the following situation (simplified):

WARNING: Do NOT assume that this is the entire code. What is shown below is EXTREMELY simplified to illustrate my problem. All the classes and methods shown below are actually far more complex.

class Child {
   Parent *parent;
}

class Parent {
   vector<Child*> child;
}

void Parent::erase(Child* a) {
   // find an iterator, it, that points to Child* a
   child.erase(it);
}

int Child::update() {
   if(x()) parent.erase(*this) // Sometimes it will; sometimes (most) it won't
   return y;
}

void Parent::update() {
   int i = 0;
   for(vector<A>::iterator it = child.begin(); it != child.end(); it++)
      i += (*it)->update();
}

So, obviously, it will crash after it runs (*it)->update() if x() returns true, because when it does, the Child will tell the Parent to remove it from the vector, invalidating the iterator.

Is there any way of fixing this other than making Parent::erase() pass an iterator all the way back up to Parent::update()? This would be problematic, as it is not called for every call to Child::update(), and thus that function would need a way to return an iterator to itself every single other time, and it is also currently returning another value. I would also prefer to avoid some other similar way to separate the erasing the process from the updating loop.

Upvotes: 11

Views: 9441

Answers (8)

Macke
Macke

Reputation: 25680

If you can communicate both erase-intent and id out of your update-function, then you can do it like this:

std::tuple<int, bool> Child::update() {
   auto erase = x();
   return {y, erase};
}

void Parent::update() {
   int i = 0;

   for(vector<A>::iterator it = child.begin(); it != child.end();) {
      auto [y, erase] += (*it)->update();
      i += y;

      if (erase) {
         it = child.erase(it); // erase returns next iterator
      } else {
         ++it;
      }
   }
}

Upvotes: 4

Thomas Young
Thomas Young

Reputation: 195

One basis for a solution (as others have said) is to switch from std::vector to std::list, since you can then erase nodes from a list without invalidating references to other nodes.

But vectors tend to have a lot better performance than lists, due to much better locality of reference and also add a lot of memory overhead (in the per node prev and next pointers, but also in the form of overhead per allocated block in your system allocator).

What I did, then, with some similar requirements, is to stick with a vector, but to allow holes or 'dead entries' in your vector when elements are deleted.

You'll need to be able to detect and skip dead entries in your vector during in-order iteration, but you can collapse the vector to remove these dead entries periodically (or just after the specific iteration loop that deletes elements has completed). You can also (if necessary) include a freelist for reusing dead entries for new children.

I describe this setup in more detail in the following blog post:: http://upcoder.com/12/vector-hosted-lists/

A couple of other setups that I talk about in that blog post that could be relevant to these kinds of requirements are a 'vector hosted' linked list and a linked list with custom memory pool.

Upvotes: 2

Charles L Wilcox
Charles L Wilcox

Reputation: 1124

I'm unsure of all you're design constraints / goals, but you could expose the need to delete a child through it's public API and just make Parent::update conditionally do the delete.

void Parent::update()
{
    int i = 0;
    for( vector<A>::iterator it = child.begin();
         it != child.end(); /* ++it done conditionally below */ )
    {
        i += (*it)->update();
        if( (*it)->should_erase() )
        {
            it = child.erase( it );
        }
        else
        {
            ++it;
        }
    }
}

bool Child::should_erase() const
{
    return x();
}

int Child::update()
{
    // Possibly do other work here.
    return y;
}

Then perhaps you can remove Parent::erase.

Upvotes: 2

Macke
Macke

Reputation: 25680

You can't really iterate over and mutate a std::vector at the same time unless there's some communication between the iteration that the mutation.

I've seen other, non-standard, containers facilatate this through "smart" iterators that know when their value has been erased (and maybe auto-jump to the next item). It's quite a bit more book-keeping though.

Upvotes: 5

b.buchhold
b.buchhold

Reputation: 3896

I think the problem is, that your method "update" also invalidates iterators in the exact same way that std::vector::erase does. So I recommend you just do the exact same thing: return the new, valid iterator and apdapt the calling loop accordingly.

Upvotes: 0

Nim
Nim

Reputation: 33655

try the modified version below:

   for(vector<A>::iterator it = child.begin(); it != child.end();)
      i += (*it++)->update();

EDIT: This is of course horribly broken, however it would work if you could change the container to std::list. Without that change, you could try a reverse iterator (if order did not matter), i.e.

   for(vector<A>::reverse_iterator it = child.rbegin(); it != child.rend();)
      i += (*it++)->update();

Upvotes: 0

Bj&#246;rn Pollex
Bj&#246;rn Pollex

Reputation: 76788

I recommend that you restructure your code to not mix the two different actions of updating (by deleting certain elements) and aggregating (by adding up the values) the data.

You could do this by changing the return-value of Child::update to something like std::pair<int, bool>, where the int is the value and the bool indicates if this element should be deleted.

If you can make Child::update a const method (meaning it does not modify the object, and only calls other const methods), you could write a simple functor that you can use with std::remove_if. Something like this:

class update_delete {
public:
    update_delete() : sum(0) {}
    bool operator()(const Child & child) {
        std::pair<int, bool> result = child.update();
        sum += result.first;
        return result.second;
    }
private:
    int sum;
}

If you cannot make update it const, just swap the element with some element from the back (you would have to keep an iterator that always points to the last element that is available for swapping). When you aggregation is done, just discard the end of the vector (that now contains all the elements that are to be deleted) using vector::resize. This is analogous to using std::remove_if, but I am not sure if it is possible/valid to use it with a predicate that modifies the objects in the sequence.

Upvotes: 5

Skurmedel
Skurmedel

Reputation: 22149

How about adding the children to delete into a list, and delete them after you've updated every child.

In effect you'll defer deletion until after the first loop.

Upvotes: 2

Related Questions