Reputation: 787
I am trying to extract all function calls from a line of code and put them into a list of strings. For example, the string:
z = x + cos(x + y) - sin(x+2) + 3;
should be parsed into
['cos(x+y)','sin(x+2)']
Using Python 2.7's re.search function and the regular expression
searchString = '([a-z]|[A-Z]|[0-9])+?[(].*?[)]'
I can extract the first function, cos(x+y), as expected.
When I use findall instead, I do get a list of two strings, but each contains only the character just before the (. That is, I get ['s','n'].
Since my regex works with search, what did I do wrong with findall?
The function I'm using is:
import re

def separateFunctionCalls(str):
    '''Separates out all function calls'''
    searchString = "([a-z]|[A-Z]|[0-9])+?[(].*?[)]"
    grp = re.findall(searchString, str)
    usingSearch = re.search(searchString, str)
    print usingSearch.group(0)  # whole first match: cos(x+y)
    print grp                   # captured group only: ['s', 'n']
And the test code is:
str = "return 2*cos(x+y) + sin(x+2)+1.0;"
separateFunctionCalls(str)
Upvotes: 1
Views: 150
Reputation: 163517
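Your pattern wraps an alternation of single-character classes in a capturing group, so each iteration of that group matches only one character.
When you repeat a capturing group, the group holds only the value of its last iteration. And when a pattern contains a capturing group, re.findall returns the group's contents instead of the whole match, which is why you get ['s','n'] while re.search(...).group(0) still shows the full cos(x+y).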
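To make the difference concrete, here is a small sketch that runs the pattern from the question against the test string from the question. The variable names are mine, and the non-capturing variant at the end is an alternative I am adding for illustration, not something the original code used.

```python
import re

line = "return 2*cos(x+y) + sin(x+2)+1.0;"
pattern = "([a-z]|[A-Z]|[0-9])+?[(].*?[)]"

# re.search: group(0) is the whole match, group(1) is only the
# last character matched by the repeated capturing group.
first = re.search(pattern, line)
whole_match = first.group(0)
last_iteration = first.group(1)

# re.findall: with one capturing group in the pattern, the result
# is a list of that group's contents, not of the whole matches.
captured = re.findall(pattern, line)

# re.finditer yields match objects, so the whole match is still reachable.
whole_matches = [found.group(0) for found in re.finditer(pattern, line)]

# Alternative (my addition): make the group non-capturing with (?:...),
# so findall has no group to return and falls back to the whole match.
non_capturing = re.findall("(?:[a-z]|[A-Z]|[0-9])+?[(].*?[)]", line)

print(whole_match)
print(last_iteration)
print(captured)
print(whole_matches)
print(non_capturing)
```

So the regex itself matches the same text in both cases; only what findall chooses to return differs.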
You could write the character class as a single one containing all the ranges and repeat that instead:
[a-zA-Z0-9]+\([^()]+\)
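As a rough sketch of how that pattern could be dropped into the function from the question (the name separate_function_calls and the choice to return the list rather than print it are my own):

```python
import re

def separate_function_calls(line):
    """Return every simple call of the form name(args) found in line."""
    # No capturing group here, so findall returns the whole matches.
    return re.findall(r"[a-zA-Z0-9]+\([^()]+\)", line)

calls = separate_function_calls("z = x + cos(x + y) - sin(x+2) + 3;")
print(calls)

# Limitation worth knowing: [^()]+ cannot cross a parenthesis, so for a
# nested call only the innermost call is matched.
nested = separate_function_calls("f(g(x))")
print(nested)
```

Note that the matches keep the original spacing, so the first one comes back as 'cos(x + y)' rather than 'cos(x+y)'. Because [^()]+ needs at least one character, a call with an empty argument list such as foo() would not be matched either.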
For a slightly broader match, you could instead match one or more characters that are neither whitespace nor parentheses, followed by everything from the opening parenthesis to the closing one:
[^\s()]+\([^()]+\)
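A quick comparison of the two patterns may help in choosing between them. The input strings with np.sin and my_func are made-up examples of mine, not from the question:

```python
import re

narrow = r"[a-zA-Z0-9]+\([^()]+\)"
broad = r"[^\s()]+\([^()]+\)"

# Hypothetical input with a dotted name and an underscore in a name.
dotted = "y = np.sin(x) + my_func(a)"
narrow_dotted = re.findall(narrow, dotted)
broad_dotted = re.findall(broad, dotted)

# The test string from the question, where an operator touches the name.
tight = "return 2*cos(x+y) + sin(x+2)+1.0;"
narrow_tight = re.findall(narrow, tight)
broad_tight = re.findall(broad, tight)

print(narrow_dotted)
print(broad_dotted)
print(narrow_tight)
print(broad_tight)
```

The broader pattern keeps dotted and underscored names intact, but it also swallows anything else that is glued to the name without whitespace, such as the 2* in 2*cos(x+y). Which one fits better depends on how the input lines are formatted.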
Upvotes: 1