xralf
xralf

Reputation: 3632

Regex matching Python function calls

I'd like to create a regular expression in Python that will match against a line in Python source code and return a list of function calls.

The typical line would look like this:

something = a.b.method(time.time(), var=1) + q.y(x.m())

and the result should be:

["a.b.method()", "time.time()", "q.y()", "x.m()"]

I have two problems here:

  1. creating the correct pattern
  2. the catch groups are overlapping

thank you for help

Upvotes: 3

Views: 4587

Answers (5)

OmarSSelim
OmarSSelim

Reputation: 41


I have an example for you proving this is doable in Python3

    import re


    def parse_func_with_params(inp):
        func_params_limiter = ","
        func_current_param = func_params_adder = "\s*([a-z-A-Z]+)\s*"

        try:
            func_name = "([a-z-A-Z]+)\s*"
            p = re.compile(func_name + "\(" + func_current_param + "\)")
            print(p.match(inp).groups())
        except:
            while 1:
                func_current_param += func_params_limiter + func_params_adder
                try:
                    func_name = "([a-z-A-Z]+)\s*"
                    p = re.compile(func_name + "\(" + func_current_param + "\)")
                    print(p.match(inp).groups())
                    break
                except:
                    pass

Command line Input: animalFunc(lion, tiger, giraffe, singe)
Output: ('animalFunc', 'lion', 'tiger', 'giraffe', 'singe')

As you see the function name is always the first in the list and the rest are the paramaters names passed


Upvotes: 0

georg
georg

Reputation: 214949

I don't think regular expressions is the best approach here. Consider the ast module instead, for example:

class ParseCall(ast.NodeVisitor):
    def __init__(self):
        self.ls = []
    def visit_Attribute(self, node):
        ast.NodeVisitor.generic_visit(self, node)
        self.ls.append(node.attr)
    def visit_Name(self, node):
        self.ls.append(node.id)


class FindFuncs(ast.NodeVisitor):
    def visit_Call(self, node):
        p = ParseCall()
        p.visit(node.func)
        print ".".join(p.ls)
        ast.NodeVisitor.generic_visit(self, node)


code = 'something = a.b.method(foo() + xtime.time(), var=1) + q.y(x.m())'
tree = ast.parse(code)
FindFuncs().visit(tree)

result

a.b.method
foo
xtime.time
q.y
x.m

Upvotes: 13

kev
kev

Reputation: 161604

$ python3
>>> import re
>>> from itertools import chain
>>> def fun(s, r):
...     t = re.sub(r'\([^()]+\)', '()', s)
...     m = re.findall(r'[\w.]+\(\)', t)
...     t = re.sub(r'[\w.]+\(\)', '', t)
...     if m==r:
...         return
...     for i in chain(m, fun(t, m)):
...         yield i
...
>>> list(fun('something = a.b.method(time.time(), var=1) + q.y(x.m())', []))
['time.time()', 'x.m()', 'a.b.method()', 'q.y()']

Upvotes: 4

Qtax
Qtax

Reputation: 33908

I don't really know Python, but I can imagine that making this work properly involves some complications, eg:

  • strings
  • comments
  • expressions that return an object

But for your example, an expression like this works:

(?:\w+\.)+\w+\(

Upvotes: 1

Evan Davis
Evan Davis

Reputation: 36592

/([.a-zA-Z]+)\(/g

should match the method names; you'd have to add the parens after since you have some nested.

Upvotes: 2

Related Questions