Reputation: 1184
My goal is to find a group in a string using regex and replace it with a space.
The group I am looking for is a run of symbols, but only when it falls between words. When I use re.findall(), it works exactly as expected:
import re
word = 'This##Is # A # Test#'
print(word)
print(re.findall(r"[a-zA-Z\s]*([\$\#\%\!\s]*)[a-zA-Z]", word))
>>> ['##', '# ', '# ', '']
But when I use re.sub(), instead of replacing only the group, it replaces the entire match:
x = re.sub(r"[a-zA-Z\s]*([\$\#\%\!\s]*)[a-zA-Z]",r' ',word)
print(x)
>>> '    #'
How can I use regular expressions to replace ONLY the group? The outcome I expect is:
'This Is A Test#'
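For what it's worth, printing the whole match next to the captured group shows why the two calls differ. This is only a minimal sketch: the names pattern and pairs are made up for illustration, and the escapes inside the character class are dropped, which does not change what it matches.

```python
import re

word = 'This##Is # A # Test#'
pattern = re.compile(r"[a-zA-Z\s]*([$#%!\s]*)[a-zA-Z]")

# Pair each whole match (group 0) with the captured symbols (group 1).
pairs = [(m.group(0), m.group(1)) for m in pattern.finditer(word)]
for whole, group in pairs:
    print(repr(whole), repr(group))

# re.sub() swaps out group 0, i.e. the letters as well as the symbols.
replaced = pattern.sub(' ', word)
print(repr(replaced))
```

re.findall() only reports group 1, which is why its output looks right, while re.sub() always replaces everything the pattern consumed.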
Upvotes: 1
Views: 83
Reputation: 43169
First, there's no need to escape every "magic" character within a character class; [$#%!\s]* is equally fine and much more readable.
Second, matching (i.e. retrieving) is different from replacing: re.sub() replaces the whole match, not just the captured group, so you could use backreferences to put the surrounding text back.
Third, if you only want to keep the # at the end, you could help yourself with a much simpler expression:
(?:[\s#](?!\Z))+
Each match would then need to be replaced by a space; see a demo on regex101.com.
In Python this could be:
import re
string = "This##Is # A # Test#"
rx = re.compile(r'(?:[\s#](?!\Z))+')
new_string = rx.sub(' ', string)
print(new_string)
# This Is A Test#
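To see what the negative lookahead (?!\Z) is doing, it may help to list the runs the pattern actually matches. A small sketch, with runs, result and edge as illustrative names only:

```python
import re

rx = re.compile(r'(?:[\s#](?!\Z))+')
text = "This##Is # A # Test#"

# Every run of whitespace/# that is NOT followed by the end of the string.
runs = [m.group(0) for m in rx.finditer(text)]
print(runs)

result = rx.sub(' ', text)
print(result)

# Only the very last character is protected by the lookahead,
# so a longer trailing run is still partly replaced.
edge = rx.sub(' ', 'a##')
print(repr(edge))
```

The last # is skipped because the position right after it is the end of the string, so the lookahead fails there. As far as I can tell this means a trailing run such as '##' would still lose all but its final character.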
Upvotes: 1
Reputation: 189377
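The problem is that your regex matches the wrong thing entirely: it also consumes the letters around the symbols, so re.sub() replaces them too. Match only the symbols, anchored between word boundaries: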
x = re.sub(r'\b[$#%!\s]+\b', ' ', word)
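As a self-contained sketch of the word-boundary idea (the variable names follow the question; nothing else is assumed):

```python
import re

word = 'This##Is # A # Test#'

# \b on both sides requires a word character immediately before and after
# the run of symbols/whitespace, so only runs between words are replaced.
x = re.sub(r'\b[$#%!\s]+\b', ' ', word)
print(x)
```

The trailing # stays because it is followed by the end of the string rather than a word character, so there is no closing word boundary.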
Upvotes: 0
Reputation: 106543
You can group the portions of the pattern you want to retain and use backreferences in your replacement string instead:
x = re.sub(r"([a-zA-Z\s]*)[\$\#\%\!\s]*([a-zA-Z])", r'\1 \2', word)
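One caveat, as far as I can tell: because the first group in that pattern is greedy and also accepts whitespace, the substitution seems to leave doubled spaces and to split the last word. A possible tightening, offered only as a sketch, is to capture a single letter before the symbols and use a lookahead for the letter after them (the names as_written and tightened are just for illustration):

```python
import re

word = 'This##Is # A # Test#'

# The two-group pattern from above, unchanged.
as_written = re.sub(r"([a-zA-Z\s]*)[\$\#\%\!\s]*([a-zA-Z])", r'\1 \2', word)
print(repr(as_written))

# Variant: keep one letter via \1, require at least one symbol/space,
# and only peek at the following letter so it can start the next match.
tightened = re.sub(r"([a-zA-Z])[$#%!\s]+(?=[a-zA-Z])", r'\1 ', word)
print(repr(tightened))
```

The lookahead matters for one-letter words such as "A": if the following letter were consumed by a second group, it could not serve as the leading letter of the next match.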
Upvotes: 0