Reputation: 1643
I have the following code that takes a string and erases non alphabet characters
void removeNonAlpha(string& str){
for (string::iterator it = str.begin(); it < str.end(); it++){
if (!(isUpperCaseLetter(*it) || isLowerCaseLetter(*it) || str == ' '))
str.erase(it--);
}
}
I showed this to my professor and he told me that doing this is risky because it may invalidate the iterator that I'm using. However, I thought that erase will only invalidate iterators after the point of the erase, and I made sure not to use any iterators after that point. So could this code crash or cause any undefined behavior?
Upvotes: 1
Views: 2044
Reputation: 241691
std::vector::erase
works as you suggest; it only invalidates iterators starting with the first erased element. However, that doesn't apply to std::string
.
The C++ standard has traditionally been more flexible with the requirements for std::string
. (Or, in other words, it has traditionally allowed implementers to use optimizations which would not be valid for vectors.) And so it is with std::string::erase
, and other string mutators.
In [string.require]
(§21.4.1 of n3797), the standard accepts that:
basic_string
sequence may be invalidated by the following uses of that basic_string
object:
basic_string
as an argument.operator[]
, at
, front
, back
, begin
, rbegin
, end
, and rend
.In other words, calling a potentially mutating function like std::string::erase
could invalidate all iterators to that string, even if no visible modifications are made to the string (for example, because the range to be erased is empty).
(The latest draft C++ standard has the same wording, although it is now paragraph 4.)
In the first loop through the string, the iterator it
has the value str.begin()
. That iterator cannot be decremented, since the result would not be inside the string. And therefore, incrementing the decremented iterator might not return it
to str.begin()
for the next iteration.
None of the above applies to integer position indices. So if you could safely replace your loop with the very similar:
void removeNonAlpha(string& str){
for (auto sz = str.size(), i = 0; i < sz; ++i){
if (!(isUpperCaseLetter(str[i]) ||
isLowerCaseLetter(str[i]) ||
str[i] == ' '))
str.erase(i--);
}
}
Upvotes: 6