fgregg

Reputation: 3249

Tracking down where a builtin function is being called from in Python

When I run cProfile on a program I'm working on, I see that I'm spending a lot of time in expensive zip calls. The code I've written isn't making these zip calls, so they must come from one of the many libraries I've imported.

Is there a tool in Python that lets me flag a function and then tells me which functions called it?

Upvotes: 3

Views: 112

Answers (3)

Dave Kirby

Reputation: 26552

You can get that information from the profiler output. Create a Stats object from the output and call stats.print_callers('zip').

This should show you which functions called it, and for each caller how many times it was called and the total and cumulative times spent in the call.
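A minimal sketch of that workflow, assuming the profile is collected in-process rather than loaded from a file. The functions sort_rows and workload are hypothetical stand-ins for library code. On Python 3, zip is a class rather than a built-in function and may not be listed by cProfile the way it is on Python 2, so this sketch looks up sorted instead; the call to print_callers is the same either way.

```python
import cProfile
import io
import pstats


def sort_rows(rows):
    # Hypothetical stand-in for library code that calls a built-in.
    return sorted(rows)


def workload():
    for _ in range(50):
        sort_rows([3, 1, 2])


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Send the report to a string buffer so it can be inspected or printed.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.print_callers("sorted")  # the argument is treated as a regular expression
report = buffer.getvalue()
print(report)
```

If the profile was saved to a file (for example with python -m cProfile -o), passing that filename to pstats.Stats instead of the Profile object should give the same kind of report.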

Upvotes: 4

Bakuriu

Reputation: 101909

A quick & dirty way of achieving what you want is pretty simple: replace the built-in zip with a custom function!

In [8]: import inspect
In [9]: def my_zip(*iterables):
   ...:     frame = inspect.currentframe().f_back
   ...:     my_zip.callers.append(frame.f_code.co_name)
   ...:     return my_zip.old_zip(*iterables)

In [10]: my_zip.callers = []

In [11]: my_zip.old_zip = zip

In [12]: import builtins

In [13]: builtins.zip = my_zip

In [14]: zip(range(5), range(4))
Out[14]: <builtins.zip at 0x7f06a2324290>

In [15]: zip.callers  # called at module level...
Out[15]: ['<module>']

(In Python 2, replace the builtins module with __builtin__.)

A smarter implementation would use a collections.Counter and avoid keeping a list of all function names that called zip.
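A rough sketch of that Counter-based variant, written for Python 3. The names counting_zip, zip_callers and pair_up are hypothetical and only illustrate the idea; the patch is undone in a finally block so the real zip is always restored.

```python
import builtins
import collections
import inspect

original_zip = builtins.zip
zip_callers = collections.Counter()


def counting_zip(*iterables, **kwargs):
    # Count calls per name of the function that called zip.
    caller = inspect.currentframe().f_back
    zip_callers[caller.f_code.co_name] += 1
    return original_zip(*iterables, **kwargs)


def pair_up(a, b):
    # Hypothetical stand-in for library code that calls zip.
    return list(zip(a, b))


builtins.zip = counting_zip
try:
    for _ in range(3):
        pair_up([1, 2], [3, 4])
finally:
    builtins.zip = original_zip  # always restore the real zip

print(zip_callers.most_common())
```

One caveat: while the patch is active, zip is a plain function, so any code that treats zip as a type (for example isinstance(x, zip)) will misbehave.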

However, you may need to know more than the name of zip's immediate caller. Perhaps another function f calls g, which calls h, which calls zip; to tell that f is the real source of the trouble, you must relate h to f somehow.

It shouldn't be hard to extend my_zip to keep track of a bit more information though.
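One possible extension, sketched here for Python 3, walks up the stack from the caller and records the chain of function names instead of a single name. The names chain_tracking_zip, zip_call_chains and MAX_DEPTH are hypothetical, and f, g and h simply mirror the example above.

```python
import builtins
import collections
import inspect

real_zip = builtins.zip
zip_call_chains = collections.Counter()
MAX_DEPTH = 5  # arbitrary cap on how many frames to record


def chain_tracking_zip(*iterables, **kwargs):
    # Walk outward from the caller of zip, collecting function names.
    names = []
    frame = inspect.currentframe().f_back
    while frame is not None and len(names) < MAX_DEPTH:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    zip_call_chains[" <- ".join(names)] += 1
    return real_zip(*iterables, **kwargs)


def h():
    return list(zip("ab", "cd"))


def g():
    return h()


def f():
    return g()


builtins.zip = chain_tracking_zip
try:
    f()
finally:
    builtins.zip = real_zip  # always restore the real zip

for chain, count in zip_call_chains.most_common():
    print(count, chain)
```

Each recorded chain starts with the immediate caller, so the entry for the call above begins with "h <- g <- f".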


However, I believe that cProfile's output should provide enough timing information to more or less understand where the zips are called. In particular you should carefully study the cumtime column of its output.

You should also take into account the number of times the functions are called. Usually a function will always do the same number of zips when called, so the number of calls to zip and to the "culprit" function should be proportional in some way.

There can of course be a lot of noise hiding these relationships, but they are worth trying to work out.

Upvotes: 0

Alvaro Fuentes

Reputation: 17455

Some time ago I had a problem like yours and managed to solve it. It's not a tool, but it may be useful to you. Just put the following code at the top of your main module.

Note: The code is tailored to the zip function you need. It worked for me (for a different function, whose name I no longer remember). The extra imports I made were third-party modules in my application; I don't know whether it works for modules in the standard library.

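import inspect
import __builtin__  # Python 2; this module is named builtins in Python 3

# Keep a reference to the real zip so the wrapper can still call it.
_original_zip = __builtin__.zip

def trackZip(*args):
    # Print the frame record of whoever called zip (Python 2 print statement).
    print inspect.getouterframes(inspect.currentframe())[1]
    return _original_zip(*args)

# Patch the builtins module so that modules imported below see the wrapper too;
# a plain "zip = trackZip" would only rebind the name inside this module.
__builtin__.zip = trackZip

#do more imports...

def test():
    zip([1,2],[3,4])

test()
zip([1,2],[3,4])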

Output:

(<frame object at 0x1948260>, '/home/user/untitled4.py', 17, 'test', ['    zip([1,2],[3,4])\n'], 0)
(<frame object at 0x11fd3d0>, '/home/user/untitled4.py', 20, '<module>', ['zip([1,2],[3,4])\n'], 0)

Upvotes: 0
