Reputation: 293
I am trying to use a regex to remove the whitespace inside sequences of consecutive '?' and/or '!' in a string. For example, "what is that ?? ? ? ?? ??? ? ! ! ! ? !" should become "what is that ??????????!!!?!". That is, I want to concatenate all '?' and '!' with no spaces in between. My current code doesn't work:
import re
s = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"
s = re.sub(r"\? +\?", "??", s)
s = re.sub(r"\? +\!", "?!", s)
s = re.sub(r"\! +\!", "!!", s)
s = re.sub(r"\! +\?", "!?", s)
which produces 'what is that ??? ???????!! !?!', where some spaces are obviously not deleted. What is going wrong in my code, and how should I revise it?
Upvotes: 1
Views: 5275
Reputation: 13176
My approach is to split the string in two, remove the spaces from the problem part with a regex, and then join the pieces back together.
import re
s = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"
splitted = s.split('that ') # don't forget to add back in 'that' later
splitfirst = splitted[0]
s = re.sub(r"\s+", "", splitted[1])  # raw string avoids an invalid-escape warning
finalstring = splitfirst+'that '+s
print(finalstring)
output:
╭─jc@jc15 ~/.projects/tests
╰─$ python3 string-replace-question-marks.py
what is that ??????????!!!?!
Upvotes: 0
Reputation:
If you want what @g.d.d.c said, and the sentence pattern is always the same, then you can try this:
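string_ = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"
string_1 = []
symbols = []
# everything before the first '?' is kept as-is
string_1.append(string_[:string_.index('?')])
# everything from the first '?' onward is the punctuation run
symbols.append(string_[string_.index('?'):])
# split() with no arguments drops all whitespace between the symbols
string_1.append("".join(symbols[0].split()))
print("".join(string_1))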
output:
what is that ??????????!!!?!
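The slicing above relies on the punctuation run sitting at the end of the string. As a rough sketch of a variant that handles runs anywhere in the text, a callback can be passed to re.sub. The helper name squeeze_punctuation and the second sample sentence are made up for illustration:

```python
import re

def squeeze_punctuation(text):
    """Remove spaces inside each run of '?' and '!' (hypothetical helper)."""
    # A run starts and ends with '?' or '!' and contains only those
    # characters and spaces in between.
    run = re.compile(r"[?!][?! ]*[?!]")
    # The callback receives each run and strips the spaces from it.
    return run.sub(lambda m: m.group().replace(" ", ""), text)

print(squeeze_punctuation("what is that ?? ? ? ?? ??? ? ! ! ! ? !"))
print(squeeze_punctuation("really ? ! yes, really ? ?"))
```

A lone '?' or '!' is left untouched, since the pattern needs at least two symbols to form a run.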
Upvotes: 0
Reputation: 47988
You're simply trying to condense whitespace around the punctuation, yeah? How about something like this:
>>> import re
>>> s = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"
>>>
>>> re.sub(r'\s*([!?])\s*', r'\1', s)
'what is that??????????!!!?!'
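Note that the result above also drops the space between "that" and the first "?". If that space should stay, one possible tweak (a sketch, not the only way) is to delete only the whitespace that sits between two punctuation marks, using a look-behind and a look-ahead:

```python
import re

s = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"

# (?<=[!?]) requires a '?' or '!' just before the whitespace,
# (?=[!?]) requires one just after; neither is consumed by the match.
result = re.sub(r"(?<=[!?])\s+(?=[!?])", "", s)
print(result)
```

Because only the whitespace itself is matched, the replacement is the empty string and no backreference is needed.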
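If you're really interested in why your approach isn't working, it has to do with how regular expressions move through a string. Matches can't overlap: after each match, the scan resumes right after it, so a '?' consumed as the end of one match can't also be the start of the next. When you write re.sub(r"\? +\?", "??", s) and run it on your string, the engine works through it like this: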
s = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"
# first match -----^^^
# internally, we have:
s = "what is that ??? ? ?? ??? ? ! ! ! ? !"
# restart scan here -^
# next match here ----^^^
# internally:
s = "what is that ??? ??? ??? ? ! ! ! ? !"
# restart scan here ----^
# next match here ------^^^
And so on. There are ways you can prevent the cursor from advancing as it's checking for a match (check out positive look-ahead).
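To make the walk-through concrete, here is a small sketch that counts matches with re.finditer, first for the original pattern and then for one where the trailing '?' is only a look-ahead, and finally folds the four substitutions into a single pass. The variable names are illustrative only:

```python
import re

s = "what is that ?? ? ? ?? ??? ? ! ! ! ? !"

# Original pattern: the trailing '?' is consumed, so it cannot start
# the next match and one gap between question marks is skipped.
consuming = [m.span() for m in re.finditer(r"\? +\?", s)]

# Look-ahead version: the trailing '?' is checked but not consumed.
peeking = [m.span() for m in re.finditer(r"\? +(?=\?)", s)]

print(len(consuming), len(peeking))

# Same idea applied to both symbols in one substitution.
fixed = re.sub(r"([?!]) +(?=[?!])", r"\1", s)
print(fixed)
```

On this input the consuming pattern finds 4 of the 5 gaps between question marks, while the look-ahead pattern finds all 5, and the single substitution yields 'what is that ??????????!!!?!'.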
Upvotes: 4