Him

Reputation: 5551

How to ensure termination of a multiprocessing Pool?

I am using a Jupyter notebook to launch all manner of nonsense out of a multiprocessing.Pool. Unfortunately, sometimes my workers have errors, and I need to close the pool and start over again.

So, I have a cell that has the line pool.close(). I then respawn a new pool: pool = Pool(n, maxtasksperchild=1), and proceed along my merry way.

However, .close()ing the pool didn't do what I thought it did, and I now have a gazillion zombies on my machine. Worse, I overwrote the pool variable, so I have no way to close them except by manually issuing kill commands. Worse still, when I issue a kill command for one of the zombies, a new one pops up in its place, which makes me suspect that pool.close() didn't, in fact, close the pool: the pool is still there, hidden somewhere, continuing even in death to execute its map_async command that will take two forevers to terminate.

In other words, pool.close() didn't close my pool.

I'm gonna sit here and kill stuff for an hour. In the meantime, does anyone know how to:

pool.kill_all_of_the_processes_really_for_real_this_time_and_prevent_them_from_ever_popping_back_up_under_any_circumstances_ever()


Here's a working example:

Cell 1

import multiprocessing as mp

def work(i):
    import time
    while True:          # loop forever, so the task never completes
        time.sleep(0.01)

Cell 2

try:
    pool.close()          # try to shut down the previous pool, if there is one
except:
    pass
pool = mp.Pool(8, maxtasksperchild=1)
pool.map_async(work, range(10000000))

Rerun Cell 2, then run ps aux | grep python | wc -l to see that the number of open processes increases by 8 each time.

Upvotes: 2

Views: 1112

Answers (1)

Mark G

Reputation: 250

As stated in the documentation:

close() Prevents any more tasks from being submitted to the pool. Once all the tasks have been completed the worker processes will exit.

Versus

terminate() Stops the worker processes immediately without completing outstanding work. When the pool object is garbage collected terminate() will be called immediately.

So you should use terminate() if you want to actually kill all the processes rather than wait for them to finish.
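
For instance, Cell 2 from the question could be rewritten roughly like this (a minimal sketch, assuming the same work function and 8-worker pool as in the question; terminate() followed by join() makes sure the old workers are gone before the pool variable is overwritten):

import multiprocessing as mp

# Tear down the previous pool (if any) before rebinding the name,
# so no orphaned worker processes are left behind.
try:
    pool.terminate()   # stop workers immediately, discarding outstanding tasks
    pool.join()        # wait until the worker processes have actually exited
except NameError:
    pass               # first run in the notebook: no pool exists yet

pool = mp.Pool(8, maxtasksperchild=1)
result = pool.map_async(work, range(10000000))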

Concerning your comment: You can try killing the processes with the following command (see here as well):

kill -9 $(ps aux | grep -v grep | grep "<your search string>" | awk '{print $2}')

For me, the path of the Python interpreter was unique because I started it in a virtualenv. You could use that to filter only the Jupyter Python processes.
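
If you'd rather do that cleanup from Python instead of the shell, something along these lines should work with the third-party psutil package (the search string below is just a placeholder; substitute whatever uniquely identifies your interpreter, e.g. the virtualenv path):

import os
import signal
import psutil  # third-party: pip install psutil

SEARCH = "/path/to/venv/bin/python"  # placeholder: adapt to your interpreter path

for proc in psutil.process_iter(["pid", "cmdline"]):
    cmdline = " ".join(proc.info["cmdline"] or [])
    if SEARCH in cmdline and proc.pid != os.getpid():
        proc.send_signal(signal.SIGKILL)  # the equivalent of kill -9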

Upvotes: 4
