Reputation: 1122
I'm trying to extract function names in Python with the pattern def ([^\s]+)\([^\.]*\). However, when the input spans multiple lines, only the first occurrence is found. I have specified the re.MULTILINE option on my regular expression, but to no avail. Let's say I have the following input:
def a():
    pass
b()
def b():
    pass
My regular expression only extracts the 'a' and doesn't continue and extract 'b'. The code I'm using is:
self.function_re = re.compile(r'def (\S+)\([^\.]*\)', re.MULTILINE)
print(self.function_re.findall(self.code))
This outputs ['a'].
Upvotes: 0
Views: 57
Reputation: 15854
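It's because the \([^\.]*\) part is greedy, i.e. it matches everything from the first opening parenthesis down to the very last closing one. The class [^\.] excludes only a literal dot, so it also matches newlines and parentheses: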
>>> r = re.compile(r'def ([^\s]+)(\([^\.]*\))')
>>> r.findall(test)
[('a', '():\n pass\nb()\ndef b()')]
If you make it non-greedy by appending ? to the star, it should work fine:
>>> r = re.compile(r'def ([^\s]+)\([^\.]*?\)')
>>> r.findall(test)
['a', 'b']
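For anyone who wants to reproduce this outside the REPL, here is a minimal self-contained sketch. The variable names (code, greedy, lazy) are made up for illustration, and the input string assumes the function bodies in the question were indented by four spaces:

```python
import re

# Sample input from the question (indentation of the bodies is an assumption).
code = "def a():\n    pass\nb()\ndef b():\n    pass\n"

# Greedy: [^\.]* consumes everything that is not a dot, newlines included,
# then backtracks only as far as the LAST ')' in the string.
greedy = re.compile(r'def (\S+)\([^\.]*\)')

# Non-greedy: [^\.]*? stops at the FIRST ')' that lets the match succeed.
lazy = re.compile(r'def (\S+)\([^\.]*?\)')

print(greedy.findall(code))  # ['a']
print(lazy.findall(code))    # ['a', 'b']
```

The greedy version swallows the second def inside its first match, so findall has nothing left to find afterwards.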
Upvotes: 0
Reputation:
I'm guessing your pattern for the parameter list is too greedy and matches all the way up to the last closing parenthesis in the string. Try using def (\S+)\([^\.]*?\) (note the ? modifier after the "zero or more" quantifier for your parameter list).
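As a side note, re.MULTILINE only changes how ^ and $ behave, and the pattern contains neither, so the flag should make no difference here. The sketch below checks that, and also tries an alternative that is not from the question: a negated class [^)]* that cannot run past the first closing parenthesis. The names (code, pattern, with_flag, without_flag, negated) are hypothetical, and the four-space indentation of the sample input is assumed:

```python
import re

# Sample input from the question (indentation of the bodies is an assumption).
code = "def a():\n    pass\nb()\ndef b():\n    pass\n"

pattern = r'def (\S+)\([^\.]*?\)'

# re.MULTILINE affects only ^ and $; this pattern has no anchors,
# so both calls give the same result.
with_flag = re.findall(pattern, code, re.MULTILINE)
without_flag = re.findall(pattern, code)
print(with_flag == without_flag)  # True

# Alternative (a swapped-in technique, not the original pattern):
# [^)]* can never cross a ')', so greediness is harmless.
negated = re.findall(r'def (\S+)\([^)]*\)', code)
print(negated)  # ['a', 'b']
```

The negated class would stop early on a default argument that itself contains parentheses, such as def f(x=g()), so treat it as a rough approach rather than a real parser.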
Upvotes: 2