Reputation: 967
I have a list of lines. I'm writing a typical text-modifying function that runs through each line in the list and modifies it when a pattern is detected.
Only after writing several functions of this type did I realize that a pattern may occur multiple times in a line.
For example, this is one of the functions I wrote:
import re

def change_eq(string):
    # inputs a string and outputs the modified string
    # replaces (X####=#) with (X####==#)
    # set pattern
    pat_eq = r"""(.*)       #stuff before
    ([\(\|][A-Z]+[0-9]*)    #either ( or | followed by the variable name
    (=)                     #single equal sign we want to change
    ([0-9]*[\)\|])          #numeric value of the variable followed by ) or |
    (.*)"""                 #stuff after
    p = re.compile(pat_eq, re.X)
    p1 = p.match(string)
    if p1:
        # if pattern in pat_eq is detected, replace that portion of the string with a modified version
        original = p1.group(0)
        fixed = p1.group(1) + p1.group(2) + "==" + p1.group(4) + p1.group(5)
        string_c = string.replace(original, fixed)
        return string_c
    else:
        # returns the original string
        return string
But for an input string such as
'IF (X2727!=78|FLAG781=0) THEN PURPILN2=(X2727!=78|FLAG781=0)*X2727'
group() only picks up the last occurrence of the pattern in the string, so the function changes it to
'IF (X2727!=78|FLAG781=0) THEN PURPILN2=(X2727!=78|FLAG781==0)*X2727'
and ignores the first occurrence. I understand that this is a product of my function using the group() method.
How would I address this issue? I know there is {m,n}, but does it work with match?
Thank you in advance.
Upvotes: 0
Views: 56
Reputation: 896
Different languages handle "global" matches in different ways. In Python, you'll want to use re.finditer and a for loop to iterate through the resulting match objects.
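As a minimal sketch of what finditer returns: the pattern below is a simplified, illustrative variant of pat_eq without the surrounding (.*) groups (that simplification is an assumption, not the pattern from the question). Each match object covers one non-overlapping occurrence in the line.

```python
import re

# Simplified, illustrative pattern: the leading and trailing (.*) groups
# of pat_eq are left out so that a match covers only one comparison.
pat = re.compile(r"([(|][A-Z]+[0-9]*)(=)([0-9]*[)|])")

line = "IF (X2727!=78|FLAG781=0) THEN PURPILN2=(X2727!=78|FLAG781=0)*X2727"

# finditer yields one match object per non-overlapping occurrence.
found = [m.group(0) for m in pat.finditer(line)]
print(found)
```

With the sample line from the question, this should report both occurrences of the FLAG781 comparison rather than only the last one.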
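Example with some of your code. For finditer to find more than one match, drop the leading and trailing (.*) groups from pat_eq; with them, the first match already spans the whole line. The remaining groups are then numbered 1 to 3:
p = re.compile(pat_eq, re.X)
string_c = string
for match_obj in p.finditer(string):
    original = match_obj.group(0)
    # group(2) is the single "=", which is replaced by "=="
    fixed = match_obj.group(1) + '==' + match_obj.group(3)
    string_c = string_c.replace(original, fixed)
return string_c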
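An alternative that avoids the replace loop is re.sub, which substitutes every non-overlapping match in one pass. The following is only a sketch under assumptions: the pattern is a variant of pat_eq that checks the ( | ) delimiters with lookarounds instead of consuming them (so a | shared by two adjacent comparisons can serve both), and the function name change_eq_all is made up for illustration.

```python
import re

# Assumed variant of pat_eq: no surrounding (.*) groups, and the
# delimiters are tested with lookarounds rather than consumed.
PAT_EQ = re.compile(
    r"""(?<=[(|])        # preceded by ( or |
        ([A-Z]+[0-9]*)   # variable name
        =                # single equal sign we want to change
        ([0-9]*)         # numeric value of the variable
        (?=[)|])         # followed by ) or |
    """,
    re.X,
)

def change_eq_all(string):
    """Replace every (X####=#) style comparison with (X####==#)."""
    return PAT_EQ.sub(r"\1==\2", string)

line = "IF (X2727!=78|FLAG781=0) THEN PURPILN2=(X2727!=78|FLAG781=0)*X2727"
print(change_eq_all(line))
```

Because an existing == or != never matches the pattern, running the function twice on the same line should leave it unchanged after the first pass.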
Upvotes: 1