Reputation: 1499
I'm trying to match multiple patterns using regex subgroups and replace each matched value with an asterisk, for a data file formatted like the string below. However, I only get the desired result for the first match; the subsequent matches consume more of the string than I expected. Is there a better approach to getting the desired output below?
import re
myString = '-fruit apple -number 123 -animal cat -name bob'
match = re.compile('(-fruit\s+)(\w+)|'
'(-animal\s+)(cat)|'
'(-name\s+)(bob)')
print(match.sub('\g<1>*', myString))
Current Output:
-fruit * -number 123 * *
Desired Output:
-fruit * -number 123 -animal * -name *
Upvotes: 2
Views: 995
Reputation: 8413
Alternation does not reset the group numbers, so your groups are numbered (1)(2)|(3)(4)|(5)(6). You only reinsert group 1, but you need to reinsert groups 3 and 5 as well. Because groups that did not participate in the match are treated as empty strings when replacing (Python 3.5 and later), you can simply add them to your replacement string: \g<1>\g<3>\g<5>*.
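A minimal sketch of that fix, assuming Python 3.5 or later (earlier versions raise an error when a replacement references a group that did not participate in the match). The names my_string, pattern and result are only illustrative; the pattern itself is the one from the question, rewritten with raw strings.

```python
import re

my_string = '-fruit apple -number 123 -animal cat -name bob'

# Groups are numbered left to right across the whole alternation:
# (1)(2) | (3)(4) | (5)(6)
pattern = re.compile(
    r'(-fruit\s+)(\w+)|'
    r'(-animal\s+)(cat)|'
    r'(-name\s+)(bob)'
)

# Exactly one of groups 1, 3, 5 participates in any given match;
# the other two are substituted as empty strings (Python 3.5+).
result = pattern.sub(r'\g<1>\g<3>\g<5>*', my_string)
print(result)  # -fruit * -number 123 -animal * -name *
```

Each match keeps its own prefix group (the option name plus whitespace) and drops the value group, which is why the asterisk ends up right after the option name.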
As a side note, I would recommend using raw strings when working with regex patterns (r'pattern'), so you do not have to wonder where to double a backslash (e.g. \\b).
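To illustrate why the raw string matters, here is a small sketch using \b; the sample text and variable names are made up for the example. In a normal string literal '\b' is a backspace character, so the regex engine never sees a word-boundary token unless the backslash is doubled or the literal is raw.

```python
import re

text = 'cat concat'

# Normal literal: '\b' becomes a backspace character (\x08), not a word boundary.
plain = re.findall('\bcat\b', text)

# Doubling the backslash passes a real \b through to the regex engine.
escaped = re.findall('\\bcat\\b', text)

# A raw string does the same without any doubling.
raw = re.findall(r'\bcat\b', text)

print(plain, escaped, raw)  # [] ['cat'] ['cat']
```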
Upvotes: 3