John Zwinck

Reputation: 249532

Python doctest hangs using ProcessPoolExecutor

This code runs fine under regular CPython 3.5:

import concurrent.futures

def job(text):
    print(text)

with concurrent.futures.ProcessPoolExecutor(1) as pool:
    pool.submit(job, "hello")

But if you run it with python -m doctest myfile.py, it hangs. Changing submit(job, ...) to submit(print, ...) avoids the hang, as does using ThreadPoolExecutor instead of ProcessPoolExecutor.

Why does it hang when run under doctest?

Upvotes: 5

Views: 4762

Answers (4)

daphtdazz

Reputation: 8159

The problem is that importing a module acquires a lock (which lock depends on your Python version); see the docs for imp.lock_held.

Locks are shared across processes created by multiprocessing. The deadlock occurs because your main process, while it is still importing your module, starts a subprocess and waits for it. The subprocess then tries to import your module as well, but cannot acquire the lock, because the main process is still holding it for the import in progress.

In step form:

  1. Main process acquires lock to import myfile.py
  2. Main process starts importing myfile.py (it has to import myfile.py because that is where your job() function is defined, which is why it didn't deadlock for print()).
  3. Main process starts a subprocess and blocks on it.
  4. Subprocess tries to acquire lock to import myfile.py

=> Deadlock.
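The four steps can be mimicked with an ordinary lock. This is only an analogy, not the real import machinery: import_lock, worker and simulate are hypothetical names, a thread stands in for the subprocess, and the worker uses a timeout so the sketch terminates instead of hanging.

```python
import threading

# Hypothetical stand-in for the import lock; not the interpreter's real lock.
import_lock = threading.Lock()


def worker(result):
    # Step 4: the worker needs the same lock. A timeout is used here so the
    # demonstration ends; without it, this call would block forever.
    acquired = import_lock.acquire(timeout=0.2)
    result.append(acquired)
    if acquired:
        import_lock.release()


def simulate():
    result = []
    with import_lock:  # Steps 1-2: the "main process" holds the lock while importing.
        t = threading.Thread(target=worker, args=(result,))
        t.start()      # Step 3: start the worker...
        t.join()       # ...and block on it while still holding the lock.
    return result[0]


# True means the worker could not get the lock while the holder waited on it.
would_deadlock = not simulate()
print(would_deadlock)
```

With a blocking acquire instead of the timeout, neither side could make progress, which is the situation described above.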

Upvotes: 7

Tarun Lalwani

Reputation: 146630

I think the issue is caused by your with statement. When you have the following:

with concurrent.futures.ProcessPoolExecutor(1) as pool:
    pool.submit(job, "hello")

It forces the job to be executed and the pool to be shut down right there. When you run this as the main process, it works, because the background thread gets time to execute the job. But when you import it as a module, the background thread does not get a chance to run, while the shutdown on the pool waits for the work to finish, hence the deadlock.

A workaround you can use is:

import concurrent.futures

def job(text):
    print(text)

pool = concurrent.futures.ProcessPoolExecutor(1)
pool.submit(job, "hello")

if __name__ == "__main__":
    pool.shutdown(True)

This prevents the deadlock and lets you run doctest as well as import the module if you want.

Upvotes: 8

Eric

Reputation: 6066

This should actually be a comment, but it's too long to be one.

Your code also fails if it's imported as a module, with the same error as under doctest. I get _pickle.PicklingError: Can't pickle <function job at 0x7f28cb0d2378>: import of module 'a' failed (I named the file a.py).

Your lack of if __name__ == "__main__": violates the programming guidelines for multiprocessing: https://docs.python.org/3.6/library/multiprocessing.html#the-spawn-and-forkserver-start-methods

I guess that the child processes will also try to import the module, which then tries to start another child process (because the pool unconditionally executes). But I'm not 100% sure about this. I'm also not sure why the error you get is can't pickle <function>.
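One possible piece of the puzzle: pickle stores a module-level function by reference (module name plus function name), not by value, so whoever pickles or unpickles it has to be able to import the defining module. A minimal sketch, using the standard-library function json.dumps as a stand-in for job:

```python
import json
import pickle

# Pickling a function records where to find it, not its code.
payload = pickle.dumps(json.dumps)

# The payload contains the module name and the function name as strings.
has_module_name = b"json" in payload
has_function_name = b"dumps" in payload

# Unpickling looks the function up again by importing the module.
restored = pickle.loads(payload)
print(has_module_name, has_function_name, restored is json.dumps)
```

If the module cannot be imported at that point, pickling or unpickling fails, which would be consistent with the "import of module 'a' failed" message.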

The issue here seems to be that you want the module to automatically start a process on import. I'm not sure if this is possible.

Upvotes: 0

Udi

Reputation: 30532

doctest imports your module in order to process it. Try adding this to prevent execution on import:

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(1) as pool:
        pool.submit(job, "hello")
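For completeness, a sketch of how the whole file might look with the guard in place. The docstring test, the return value of job and the main() wrapper are additions for illustration and are not part of the original code.

```python
import concurrent.futures


def job(text):
    """Print text and return it.

    >>> job("hello")
    hello
    'hello'
    """
    print(text)
    return text


def main():
    # The pool is created only when the file is run as a script,
    # never as a side effect of importing it.
    with concurrent.futures.ProcessPoolExecutor(1) as pool:
        pool.submit(job, "hello")


if __name__ == "__main__":
    main()
```

Because the pool is no longer started at import time, python -m doctest myfile.py only imports the module and runs the docstring example.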

Upvotes: 0
