Reputation: 1929
How can you profile a Python module that uses multiprocessing (multiprocessing.Pool.map) so that each spawned process is also profiled line by line?
Currently I use line_profiler, but it doesn't support multiprocessing. Is there a way to do this manually? Or maybe with some other tool?
Upvotes: 16
Views: 2586
Reputation: 309
My own solution was simply to bypass the multiprocessing call itself, but this may not be valid in your scenario:
import sys

if __name__ == "__main__":
    if len(sys.argv) >= 2 and sys.argv[1] == 'profile':
        import line_profiler

        def profile_function():
            lp = line_profiler.LineProfiler()
            lp.add_function(try_generate)
            # lp.add_function(.....)
            lp.run('try_generate(None)')
            lp.print_stats()
            lp.dump_stats('main.py.lprof')

        profile_function()
    else:
        # This function starts a concurrent.futures.ProcessPoolExecutor
        # that runs try_generate in multiple worker processes
        main()
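Assuming the script above is saved as main.py, running python main.py profile takes the profiling branch and executes try_generate in a single process under line_profiler; any other invocation runs the normal multiprocessing path via main().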
Upvotes: 0
Reputation: 2860
The normal way of using line_profiler, adding @profile
to the function being profiled and running kernprof -v -l script.py,
leads to the following error with multiprocessing:
Can't pickle <class '__main__.Worker'>: attribute lookup Worker on __main__ failed.
To fix this, we have to set up the line_profiler ourselves in the sub-process we want to profile, rather than doing it globally via kernprof:
import multiprocessing as mp
import os
import line_profiler

class Worker(mp.Process):
    def run(self):
        prof = line_profiler.LineProfiler()
        # Wrap all functions that you want to be profiled in this process.
        # These can be global functions or any class methods.
        # Make sure to replace instance methods on the class level,
        # not the bound method self.run2.
        Worker.run2 = prof(Worker.run2)
        ...
        # Run the real work
        self.run2()
        # Store stats in a separate file for each process
        prof.dump_stats(f'worker-{os.getpid()}.lprof')

    def run2(self):
        # The real run method, renamed
        ...
Running the script now generates one profile file per worker process, which we can then visualize with:
python -m line_profiler worker-<pid>.lprof
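The question asks about multiprocessing.Pool.map specifically; the same per-process setup can be applied there through a pool initializer. Below is a minimal sketch under that assumption; the names work, init_worker, run_task and the pool-<pid>.lprof filenames are illustrative, not part of the answer above:
import multiprocessing as mp
import os
import line_profiler

prof = None  # per-process profiler, created in the pool initializer

def work(x):
    # The function you actually want profiled line by line.
    return x * x

def init_worker():
    # Runs once in every pool process: create a profiler and rebind
    # `work` at module level so all tasks in this process are recorded.
    global prof, work
    prof = line_profiler.LineProfiler()
    work = prof(work)

def run_task(x):
    try:
        return work(x)
    finally:
        # dump_stats writes cumulative stats, so dumping after every
        # task leaves one up-to-date file per worker process.
        prof.dump_stats(f'pool-{os.getpid()}.lprof')

if __name__ == '__main__':
    with mp.Pool(2, initializer=init_worker) as pool:
        print(pool.map(run_task, range(8)))
Each resulting pool-<pid>.lprof file can then be viewed with python -m line_profiler as above.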
Upvotes: 4