Reputation: 1591
I wrote a function in c++ to remove parenthesis from a string, but it doesn't always catch them all for some reason that I'm sure is really simple.
string sanitize(string word)
{
int i = 0;
while(i < word.size())
{
if(word[i] == '(' || word[i] == ')')
{
word.erase(i,1);
}
i++;
}
return word;
}
Sample result:
Input: ((3)8)8)8)8))7
Output: (38888)7
Why is this? I can get around the problem by calling the function on the output (so running the string through twice), but that is clearly not "good" programming. Thanks!
Upvotes: 0
Views: 6381
Reputation: 208323
while(i < word.size())
{
if(word[i] == '(' || word[i] == ')')
{
word.erase(i,1);
}
i++;
}
When you remove an element the next element is moved to that location. If you want to test it, you will have to avoid incrementing the counter:
while (i < word.size()) {
if (word[i] == '(' || word[i] == ')' ) {
word.erase(i,1);
} else {
++i;
}
}
That can also be done with iterators, but either option is bad. For each parenthesis in the string, all elements that are after it will be copied, which means that your function has quadratic complexity: O(N^2)
. A much better solution is use the erase-remove idiom:
s.erase( std::remove_if(s.begin(), s.end(),
[](char ch){ return ch==`(` || ch ==`)`; })
s.end() );
If your compiler does not have support for lambdas you can implement the check as a function object (functor). This algorithm has linear complexity O(N)
as the elements that are not removed are copied only once to the final location.
Upvotes: 4
Reputation: 881323
It's failing because your incrementing the index in all cases. You should only do that if you're not deleting the character, since the deletion shifts all the characters beyond that point back by one.
In other words, you'll have this problem wherever you have two or more consecutive characters to delete. Rather than deleting them both, it "collapses" the two into one.
Running it through your function twice will work on that particular input string but you'll still get into trouble with something like "((((pax))))" since the first call will collapse it to "((pax))" and the second will give you "(pax)".
One solution is to not advance the index when deleting a character:
std::string sanitize (std::string word) {
int i = 0;
while (i < word.size()) {
if(word[i] == '(' || word[i] == ')') {
word.erase(i,1);
continue;
}
i++;
}
return word;
}
However, I'd be using the facilities of the language a little more intelligently. C++ strings already have the capability to search for a selection of characters, one that's possibly far more optimised than a user loop. So you can use a much simpler approach:
std::string sanitize (std::string word) {
int spos = 0;
while ((spos = word.find_first_of ("()", spos)) != std::string::npos)
word.erase (spos, 1);
return word;
}
You can see this in action in the following complete program:
#include <iostream>
#include <string>
std::string sanitize (std::string word) {
int i = 0;
while ((i = word.find_first_of ("()", i)) != std::string::npos)
word.erase (i, 1);
return word;
}
int main (void) {
std::string s = "((3)8)8)8)8))7 ((((pax))))";
s = sanitize (s);
std::cout << s << '\n';
return 0;
}
which outputs:
388887 pax
Upvotes: 0
Reputation: 503
Why not just use strtok and a temporary string?
string sanitize(string word)
{
int i = 0;
string rVal;
char * temp;
strtok(word.c_str(), "()"); //I make the assumption that your values should always start with a (
do
{
temp = strtok(0, "()");
if(temp == 0)
{
break;
}
else { rVal += temp;}
}while(1);
return rVal;
}
Upvotes: -2
Reputation: 183868
if(word[i] == '(' || word[i] == ')')
{
word.erase(i,1);
}
i++;
If you erase a parenthesis, the next character moves to the index previously occupied by the parenthesis, so it is not checked. Use an else
.
if(word[i] == '(' || word[i] == ')')
{
word.erase(i,1);
} else {
i++;
}
Upvotes: 11