Reputation: 4784
Although there are tens of questions about remove_if + erase for vector. I couldn't find what is the performance of such action. When I write:
myVector.erase(remove_if(myVector.begin(),
myVector.end(),
some_predicate), myVector.end());
The remove if will return an iterator to the last relevant item + 1 (let's call it X). I believe that this will happen in O(n).
But how the erase will work ?
Thanks.
Upvotes: 8
Views: 5856
Reputation: 171263
Consider this vector:
|0|1|2|3|4|5|6|7|8|9|
We use remove_if
to remove all elements that are multiples of 4:
std::remove_if(v.begin(), v.end(), [](auto i){ return i != 0 && !(i%4); });
This starts iterating through the vector with an iterator X until it finds an element where the predicate returns true:
|0|1|2|3|4|5|6|7|8|9|
X
This is the first element we want to remove.
Next it creates another iterator pointing to the next element, Y = X+1, and checks the predicate for *Y:
|0|1|2|3|4|5|6|7|8|9|
X Y
The predicate is false, so we want to keep that element, so it assigns the next element to the element we want to remove, by doing *X = std::move(*Y)
:
|0|1|2|3|5|5|6|7|8|9|
X Y *X = std::move(*Y)
So we have two iterators, X and Y, where X points to the next element in the "output" (i.e. the elements we're not removing) and Y is the next element to consider removing.
We move both iterators to the next position, check the predicate for Y (which is false again), and do another assignment:
|0|1|2|3|5|6|6|7|8|9|
X Y *X = std::move(*Y)
Then it does the same again at the next position:
|0|1|2|3|5|6|7|7|8|9|
X Y *X = std::move(*Y)
And then it moves on, but finds that the predicate is true for Y
|0|1|2|3|5|6|7|7|8|9|
X Y
So it just increments Y, which skips that element and so doesn't copy it into the "output" position at X:
|0|1|2|3|5|6|7|7|8|9|
X Y
The predicate is not true for Y, so it assigns it to X:
|0|1|2|3|5|6|7|9|8|9|
X Y *X = std::move(*Y)
Then it increments X and Y again
|0|1|2|3|5|6|7|9|8|9|
X Y
Now Y is past-the-end so we return X (which points past-the-end of the output sequence, i.e. the elements we want to keep).
After the remove_if
returns X we call v.erase(X, v.end())
, so the vector invokes the destructors for each element from X to the end:
|0|1|2|3|5|6|7|9|~|~|
X end
And then sets the size so the vector ends at X:
|0|1|2|3|5|6|7|9|
end
After this v.capacity() >= v.size()+2
because the memory that was used by the two final elements is still present, but is not in use.
Upvotes: 27
Reputation: 223
Why not use a swap 'n pop approach? We fooled around a lot with optimizing erase in vectors and found this to be the fastest, as it has O(1)
complexity. Only downside is that it doesn't preserve order. Which is fine is a lot of cases. Here is the template method for such an operation:
template<typename T>
inline typename std::vector<T>::iterator unorderedErase(std::vector<T>& p_container,
typename std::vector<T>::iterator p_it)
{
if (p_it != p_container.end() - 1)
{
std::swap(*p_it, p_container.back());
p_container.pop_back();
return p_it;
}
// else
p_container.pop_back();
return p_container.end();
}
Upvotes: 0
Reputation: 29966
The complexity of vector::erase
is well-defined:
Linear: the number of calls to the destructor of T is the same as the number of elements erased, the assignment operator of T is called the number of times equal to the number of elements in the vector after the erased elements
The way it is going to work internally (i.e. how exactly is it going to remove your elements) is kind of irrelevant.
The complexity of remove_if
is also defined and is
Exactly std::distance(first, last) applications of the predicate.
So your code has linear complexity.
Upvotes: 8