Drxxd

Reputation: 1929

Using line profiler with multiprocessing

How can you profile a Python module that uses multiprocessing (multiprocessing.Pool.map) so that each spawned process is also profiled line by line?

Currently I use line_profiler for profiling but it doesn't support multiprocessing. Is there a way to do it manually? Or maybe use some other tool?

Upvotes: 16

Views: 2586

Answers (2)

9 Guy

Reputation: 309

My own solution was to bypass the multiprocessing call entirely and profile the worker function directly in the main process, though this may not be valid in your scenario:

import sys

# try_generate and main are defined elsewhere in this script
if __name__ == "__main__":
    if len(sys.argv) >= 2 and sys.argv[1] == 'profile':
        import line_profiler

        def profile_function():
            lp = line_profiler.LineProfiler()
            lp.add_function(try_generate)
            # lp.add_function(.....)
            # Call the worker function directly, without multiprocessing
            lp.run('try_generate(None)')
            lp.print_stats()
            lp.dump_stats('main.py.lprof')

        profile_function()
    else:
        # This function starts a concurrent.futures.ProcessPoolExecutor
        # that runs try_generate in multiple worker processes
        main()

Upvotes: 0

Chris

Reputation: 2860

The usual way to use line_profiler, adding @profile to the function being profiled and running kernprof -v -l script.py, leads to the following error with multiprocessing:

Can't pickle <class '__main__.Worker'>: attribute lookup Worker on __main__ failed.

To fix this, we have to set up line_profiler ourselves in the subprocess we want to profile, rather than globally via kernprof:

import multiprocessing as mp
import line_profiler

class Worker(mp.Process):

    def run(self):
        prof = line_profiler.LineProfiler()
        # Wrap every function you want profiled in this process.
        # These can be module-level functions or class methods.
        # Replace methods on the class (Worker.run2), not the bound method (self.run2).
        Worker.run2 = prof(Worker.run2)
        ...
        # Run the real work
        self.run2()
        # Dump the stats; with several workers, put something unique
        # (e.g. the PID) in the filename so the files don't overwrite each other
        prof.dump_stats('worker.lprof')

    def run2(self):
        # real run method renamed
        ...

Running the script now generates a profile file, which we can then view with:

python -m line_profiler worker.lprof
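
Since the question asks about multiprocessing.Pool.map specifically, the same idea (create the profiler inside the child process and dump one stats file per process) might be adapted roughly as follows. This is only a sketch: work and profiled_work are hypothetical names, the file-naming scheme is an assumption, and the fallback to an unprofiled call when line_profiler is not installed is a design choice of this example, not something line_profiler does for you.

```python
import multiprocessing as mp
import os

try:
    import line_profiler  # third-party package
except ImportError:
    # Assumption of this sketch: run unprofiled if the profiler is missing
    line_profiler = None


def work(n):
    # Hypothetical CPU-bound task standing in for the real function
    total = 0
    for i in range(n):
        total += i * i
    return total


def profiled_work(n):
    # Runs inside a pool worker: profile a single call to work()
    if line_profiler is None:
        return work(n)
    prof = line_profiler.LineProfiler()
    prof.add_function(work)
    result = prof.runcall(work, n)
    # One file per (process, argument); a repeated argument handled by
    # the same process would overwrite the earlier file
    prof.dump_stats(f"work-{os.getpid()}-{n}.lprof")
    return result


if __name__ == "__main__":
    with mp.Pool(processes=2) as pool:
        print(pool.map(profiled_work, [1000, 2000, 3000]))
```

Each resulting .lprof file can then be viewed with python -m line_profiler, as above. Creating a fresh profiler per task keeps the sketch simple, at the cost of producing many small files for long task lists.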

Upvotes: 4
