Reputation: 697
I'm tasked with taking a string, finding all instances of two different types of matches in that string, and performing a similar-but-different replacement on each match of each type, all using a single RegEx and a single pass through re.sub().
Specifically, I'm looking for any < or <= and replacing them with > and >= respectively. Each comparison operator in need of replacement sits between two words, as defined by \w*, with zero or more spaces \s* on either side.
I have found a regular expression that finds all necessary matches and lumps them into useful groups:
((\b\w*(\s*<\s*)\w*\b)|(\b\w*(\s*<=\s*)\w*\b))+
This parses the string so that every comparison meeting the search criteria is matched, with each < captured in match group \3 and each <= captured in match group \5.
My question is this: is there a way to replace all \3 with ' > ' and all \5 with ' >= ' in a single call to re.sub()? I've read through the documentation for the sub method in Python's re module but haven't been able to find a way, perhaps due to my limited familiarity with the syntax and behavior of the whole system.
I am allowed and expected to compile the regex separately before the substitution, so the final setup will look something like this:
r1 = re.compile(r"((\b\w*(\s*<\s*)\w*\b)|(\b\w*(\s*<=\s*)\w*\b))+")
subStr = r" ??? "
r1.sub( ???, subStr ??? )
Here is some example input/output:
Input string:
"v1 < v2 v3 <= v4 v5 > v6 v7 >= v8"
Running the substitution would produce:
"v1 > v2 v3 >= v4 v5 > v6 v7 >= v8"
Plugging my pattern and the input string into https://regex101.com/ (Python flavor) shows that the pattern matches the input string in the way I described.
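The same grouping can also be checked directly in Python. This is only a minimal sketch: the names pattern, sample, and matches are illustrative and not part of the task. It lists each match alongside groups 3 and 5 for the example input.

```python
import re

# The pattern from the question, compiled ahead of time.
pattern = re.compile(r"((\b\w*(\s*<\s*)\w*\b)|(\b\w*(\s*<=\s*)\w*\b))+")
sample = "v1 < v2 v3 <= v4 v5 > v6 v7 >= v8"

# Collect the whole match plus groups 3 (the '<' part) and 5 (the '<=' part).
matches = [(m.group(0), m.group(3), m.group(5)) for m in pattern.finditer(sample)]

for whole, lt_part, le_part in matches:
    print(repr(whole), repr(lt_part), repr(le_part))
```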
Upvotes: 1
Reputation: 89557
You only have to make the = optional and capture the parts around the <:
re.sub(r'\b(?<=\w)(\s*)<(=?\s*\w)', r'\1>\2', s)
For efficiency reasons I started the pattern with the word boundary \b; the lookbehind (?<=\w) that follows ensures there is at least one word character before the operator.
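Since the question requires compiling the regex beforehand, the same substitution can be sketched with a precompiled pattern. The function name flip_less_than and the variable names are assumptions made for illustration; the pattern and replacement string are the ones given above.

```python
import re

# Same pattern as above, compiled once up front.
pattern = re.compile(r'\b(?<=\w)(\s*)<(=?\s*\w)')

def flip_less_than(text):
    """Turn '<' into '>' and '<=' into '>=' when a word character precedes the operator."""
    # \1 is the whitespace before '<'; \2 is the optional '=', the
    # whitespace after it, and the first character of the next word.
    return pattern.sub(r'\1>\2', text)

sample = "v1 < v2 v3 <= v4 v5 > v6 v7 >= v8"
print(flip_less_than(sample))
# v1 > v2 v3 >= v4 v5 > v6 v7 >= v8
```

Note that this keeps whatever whitespace was already around the operator rather than normalizing it to a single space, and that, unlike \w* in the question's pattern, it requires at least one word character on each side.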
Upvotes: 3