bsam

Reputation: 930

Weird behaviour with threads and processes mixing

I'm running the following Python code:

import threading
import multiprocessing

def forever_print():
    while True:
        print("")

def main():
    t = threading.Thread(target=forever_print)
    t.start()
    return


if __name__=='__main__':
    p = multiprocessing.Process(target=main)
    p.start()
    p.join()
    print("main process on control")

It terminates.

When I unwrapped main from the new process, and just ran it directly, like this:

if __name__ == '__main__':
    main()

The script went on forever, as I thought it should. Am I wrong to assume that, given that t is a non-daemon thread, p shouldn't halt in the first case?

I basically set up this little test because I've been developing an app in which threads are spawned inside subprocesses, and it's been showing some weird behaviour (sometimes it terminates properly, sometimes it doesn't). I guess what I wanted to know, in a broader sense, is whether there is some sort of "gotcha" when mixing these two Python libraries.

My running environment: Python 2.7 on Ubuntu 14.04 LTS

Upvotes: 3

Views: 839

Answers (3)

dano

Reputation: 94901

The gotcha is that the multiprocessing machinery calls os._exit() after your target function exits, which violently kills the child process, even if it has background threads running.

The code for Process.start() looks like this:

def start(self):
    '''
    Start child process
    '''
    assert self._popen is None, 'cannot start a process twice'
    assert self._parent_pid == os.getpid(), \
           'can only start a process object created by current process'
    assert not _current_process._daemonic, \
           'daemonic processes are not allowed to have children'
    _cleanup()
    if self._Popen is not None:
        Popen = self._Popen
    else:
        from .forking import Popen
    self._popen = Popen(self)
    _current_process._children.add(self)

Popen.__init__ looks like this:

def __init__(self, process_obj):
    sys.stdout.flush()
    sys.stderr.flush()
    self.returncode = None

    self.pid = os.fork()  # This forks a new process
    if self.pid == 0:  # This if block runs in the new process
        if 'random' in sys.modules:
            import random
            random.seed()
        code = process_obj._bootstrap()  # This calls your target function
        sys.stdout.flush()
        sys.stderr.flush()
        os._exit(code)  # Violent death of the child process happens here

The _bootstrap method is the one that actually executes the target function you passed to the Process object. In your case, that's main. main returns right after starting your background thread, even though a normal Python process wouldn't exit at that point, because a non-daemon thread is still running.

However, as soon as execution hits os._exit(code), the child process is killed, regardless of any non-daemon threads still executing.
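You can see os._exit()'s behavior outside of multiprocessing entirely. In this standalone sketch (the inline child script is illustrative, not the library's code), a non-daemon thread sleeping for 60 seconds would normally keep the interpreter alive, but os._exit() kills it immediately:

```python
import subprocess
import sys
import time

# Illustrative child script: start a non-daemon thread that sleeps for
# 60 seconds, then call os._exit(). If os._exit() waited for non-daemon
# threads the way a normal interpreter exit does, this would take a minute.
child = """
import os, threading, time
threading.Thread(target=time.sleep, args=(60,)).start()
os._exit(0)
"""

start = time.time()
subprocess.check_call([sys.executable, "-c", child])
elapsed = time.time() - start
print("child exited after %.1f seconds" % elapsed)  # far less than 60
```

The child exits almost instantly, despite its sleeping thread.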

Upvotes: 2

Tim Peters

Reputation: 70685

For now, threads created by multiprocessing worker processes act like daemon threads with respect to process termination: the worker process exits without waiting for the threads it created to terminate. This is because worker processes use os._exit() to shut down, which skips most normal shutdown processing; in particular, it skips the normal exit processing (sys.exit()) that .join()'s non-daemon threading.Threads.

The easiest workaround is for worker processes to explicitly .join() the non-daemon threads they create.

There's an open bug report about this behavior, but it hasn't made much progress: http://bugs.python.org/issue18966

Upvotes: 2

noxdafox

Reputation: 15040

You need to call t.join() in your main function.

As your main function returns, the process gets terminated with both its threads.

p.join() blocks the main thread, waiting for the spawned process to end. Your spawned process then creates a thread but does not wait for it to end; it returns immediately, taking the thread down with it.

While threads share memory, processes don't. Therefore, the thread you create in the newly spawned process remains confined to that process; the parent process is not aware of it.

Upvotes: 2
