Michael Stachowsky

Reputation: 787

Python 2.7 re search and findall providing different search results

I am trying to extract all function calls from a line of code and put them into a list of strings. For example, the string:

z = x + cos(x+y) - sin(x+2) + 3;

should be parsed into

['cos(x+y)','sin(x+2)']

Using Python 2.7's re.search function and the regular expression

searchString = '([a-z]|[A-Z]|[0-9])+?[(].*?[)]'

I can extract the first function, cos(x+y), as expected.

When I use findall instead, I do get a list of two strings, but each contains only the single character just before the (. That is, I get ['s','n'].

Since my regex works with search, what did I do wrong with findall?

The function I'm using is:

import re

def separateFunctionCalls(line):
    '''Separates out all function calls'''
    searchString = "([a-z]|[A-Z]|[0-9])+?[(].*?[)]"
    grp = re.findall(searchString, line)         # every result findall reports
    usingSearch = re.search(searchString, line)  # first match only
    print usingSearch.group(0)                   # full text of the first match
    print grp

And the test code is:

line = "return 2*cos(x+y) + sin(x+2)+1.0;"
separateFunctionCalls(line)

Upvotes: 1

Views: 150

Answers (1)

The fourth bird

Reputation: 163517

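Your pattern wraps an alternation of three character classes, none of which has its own quantifier, in a capturing group, so each iteration of the group matches exactly one character.

When you repeat a capturing group, the group holds only the value of its last iteration. Because re.findall returns the contents of the capturing group rather than the whole match when the pattern contains one, you get the last character before each opening parenthesis. That is why you see ['s', 'n'], while search followed by group(0) shows the full match.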
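
A minimal sketch of that behaviour, using the pattern and test string from the question (the variable names are only illustrative). It compares group(0) and group(1) of the search result with what findall and finditer return:

```python
import re

# Pattern and input taken from the question
pattern = r'([a-z]|[A-Z]|[0-9])+?[(].*?[)]'
line = "return 2*cos(x+y) + sin(x+2)+1.0;"

m = re.search(pattern, line)
whole_match = m.group(0)   # the entire matched text
last_iter = m.group(1)     # only the last repetition of the group

# With exactly one capturing group, findall returns group 1 for each match
groups_only = re.findall(pattern, line)

# finditer yields match objects, so the whole match is still available
whole_matches = [mm.group(0) for mm in re.finditer(pattern, line)]

# Making the group non-capturing also makes findall return whole matches
non_capturing = re.findall(r'(?:[a-z]|[A-Z]|[0-9])+?[(].*?[)]', line)

print(whole_match)
print(last_iter)
print(groups_only)
print(whole_matches)
print(non_capturing)
```

So if the original pattern has to stay as it is, iterating with re.finditer and reading group(0), or switching the group to the non-capturing form (?:...), should both give the full function calls.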

You could instead merge all the ranges into a single character class, repeat that, and drop the capturing group:

[a-zA-Z0-9]+\([^()]+\)
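
As a rough sketch, this is how that pattern could be plugged into the function from the question. The name separate_function_calls and the module-level compiled pattern are my own choices for illustration, not something the question prescribes:

```python
import re

# No capturing group here, so findall returns the whole match each time
CALL_PATTERN = re.compile(r'[a-zA-Z0-9]+\([^()]+\)')

def separate_function_calls(line):
    """Return every name(args) substring found in line."""
    return CALL_PATTERN.findall(line)

calls = separate_function_calls("return 2*cos(x+y) + sin(x+2)+1.0;")
print(calls)

# [^()]+ excludes parentheses, so a nested call yields only the inner one
nested = separate_function_calls("f(g(x))")
print(nested)
```

Keep in mind that [^()]+ needs at least one character between the parentheses, so a call without arguments such as f() is not matched, and for nested calls only the innermost call is returned.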

For a somewhat broader match, you could instead match one or more characters that are neither whitespace nor parentheses, followed by everything from the opening parenthesis to the closing one:

[^\s()]+\([^()]+\)
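
To see what "broader" means in practice, here is a small comparison of the two patterns. The second input string, with a dotted name and an underscore, is a made-up example and does not come from the question:

```python
import re

narrow = r'[a-zA-Z0-9]+\([^()]+\)'
broad = r'[^\s()]+\([^()]+\)'

line = "return 2*cos(x+y) + sin(x+2)+1.0;"   # test string from the question
dotted = "r = math.cos(x) + my_func(y)"      # hypothetical input

narrow_line = re.findall(narrow, line)
broad_line = re.findall(broad, line)
narrow_dotted = re.findall(narrow, dotted)
broad_dotted = re.findall(broad, dotted)

print(narrow_line)
print(broad_line)
print(narrow_dotted)
print(broad_dotted)
```

The broad pattern keeps dotted and underscored names intact (math.cos(x), my_func(y)), but it also swallows any non-whitespace text glued to the name, so 2*cos(x+y) comes back as one item. Which of the two fits better depends on how the input lines are spaced.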

Upvotes: 1
