Reputation: 1469
I have a vector and I am searching for an element in it while iterating over the vector with a for-each loop. If I find any invalid elements during the search, I would like to remove them from the vector.
Basically, I want to do something like this:
for (auto el : vec) {
if (el == whatImLookingFor) {
return el;
} else if (isInvalid(el)) {
vec.erase(el);
}
}
I looked at some other questions like this and this, but both recommend using std::remove_if
. That will iterate over the whole vector and remove all invalid elements, instead of iterating only until I find the element I'm looking for, and ignoring any elements after that.
What would be a good way to do this?
Upvotes: 12
Views: 2176
Reputation: 338
I was curious about the performance of this, so I ran a quick naive benchmark comparing Benjamin's find then partial clean and hnefatl's for-loop. Benjamin's is indeed faster: 113x faster. Impressive.
But I was curious where most of the computation was going, as it was greater than the sum of remove_if
and find
, which are the only functions that actually iterate through the array. As it turns out, however, the vec.erase
line in his code is actually pretty slow. This is because in remove_if
he is cleaning the area from the start to the location of the found value, which results in a gap in the middle of the array from the invalid values. vec.erase
must then copy the remaining values to fill the gap and finally resize the array.
I ran a third test with a full-sized remove_if
/vec.erase
followed by a find
, so the gap happens at the end and no copy is necessary, just truncation. It turned out to be about 15% faster and leaves the whole vector clean. Note that this assumes that testing for validity is cheap. If it's more than a few simple comparisons, Benjamin's answer will win hands-down, as pointed out by Chris in the comments.
auto p = std::remove_if(vec.begin(), vec.end(), isInvalid);
vec.erase(p, vec.end());
return std::find(vec.begin(), vec.end(), whatImLookingFor);
Upvotes: 8
Reputation: 103751
You should still use std::remove_if
, just call std::find
beforehand.
auto el = std::find(vec.begin(), vec.end(), whatImLookingFor);
auto p = std::remove_if(vec.begin(), el, isInvalid);
// returns the iterator, not the element itself.
// if the element is not found, el will be vec.end()
return vec.erase(p, el);
This will usually be more efficient than removing one element at a time.
Upvotes: 20
Reputation: 15334
As @BenjaminLindley and @JMerdich pointed out, for the problem stated, a two-pass approach is probably simpler and more efficient.
In a realistic situation, it is possible that there is some expensive calculation that needs to be done either to determine if an element is invalid or to determine if an element is the one we are looking for:
In which case a two-pass approach becomes less desirable because it causes us to do this expensive calculation twice.
But it is possible to do a single-pass approach without calling std::vector::erase
multiple times within the loop. It is not too hard to write std::remove_if
yourself, then we can get it to do both checks at the same time. We still just call std::vector::erase
once at the end:
std::vector<T>::iterator
findAndRemoveInvalid(std::vector<T>& vec, U whatImLookingFor) {
// Find first invalid element - or element you are looking for
auto first = vec.begin();
for(;first != vec.end(); ++first) {
auto result = someExpensiveCalculation(*first);
if (result == whatImLookingFor)
return first;
if (isInvalid(result))
break;
}
if (first == vec.end())
return first;
// Find subsequent valid elements - or element you are looking for
auto it = first + 1;
for(;it != vec.end(); it++) {
auto result = someExpensiveCalculation(*it);
if (result == whatImLookingFor)
break;
if (!isInvalid(result)) {
*first++ = std::move(*it); // shift valid elements to the start
continue;
}
}
// erase defunct elements and return iterator to the element
// you are looking for, or vec.end() if not found.
return vec.erase(first, it);
}
It is clearly more complicated though, so measure performance first.
Upvotes: 2
Reputation: 6037
This is an intuitive approach to the problem that keeps the looping structure - while it only performs a single pass, due to repeated erasing it's likely to be less efficient than a two-pass approach. For this more efficient approach, you should use Benjamin Lindley's answer.
Modifying the code in the answer to the first link you gave, it looks like something like this would fit your specification:
for (auto i = vec.begin(); i != vec.end();)
{
if (*i == whatImLookingFor)
return i;
else if (isInvalid(*i))
i = vec.erase(i); // slow, don't use this version for real
else
++i;
}
erase
.You'll still need to handle the case where the element's not found, probably by returning vec.end()
.
Upvotes: 7
Reputation: 17444
If a simple loop, exited with break
is too primitive for you, I'd suggest using std::find()
to get an iterator to the searched element and then to use vector.erase()
to delete the othere elements.
Upvotes: 1