Qiu Yangfan

Reputation: 891

"subprocess.Popen().stdout.readline()" does not return in multithreaded Python

The code below runs on Windows 7 with two threads, T1 and T2. T1 prints the output of dir in the original window, and T2 runs ping for about 4 seconds in a new window.

import os
import sys
import logging
import subprocess
import threading

class T1 (threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
    def run(self):
        proc = subprocess.Popen("dir", shell=True, stdout=subprocess.PIPE)
        for line in iter(proc.stdout.readline, ''):
            logging.debug(line)
        logging.info("HEREEEEEEEE")

class T2 (threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
    def run(self):
        subprocess.Popen(["ping.exe", "-n", "4", "127.0.0.1"], creationflags=subprocess.CREATE_NEW_CONSOLE)
        logging.info("")

if __name__=='__main__':
    logger = logging.getLogger('root')
    FORMAT = "[TID:%(thread)d %(funcName)s L#%(lineno)s] %(message)s"
    logging.basicConfig(format=FORMAT, level=logging.DEBUG)

    t1 = T1()
    t2 = T2()
    t1.start()
    t2.start()
    t1.join()
    t2.join()

    sys.exit(0)

I expect the line logging.info("HEREEEEEEEE") in thread T1 to be printed right after the output of dir is printed.

What doesn't make sense to me: why isn't the line printed immediately, instead of 4 seconds later when thread T2 has finished?

I wonder if it's related to file descriptors in multithreading.

Upvotes: 1

Views: 316

Answers (1)

Eryk Sun

Reputation: 34280

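In Python 2, your code has a race condition that can leak the inheritable write end of the pipe created by the T1 thread into the ping.exe process created by the T2 thread. The final readline call on the pipe, the one that returns an empty string to signal end of file, won't return until the pipe closes, which requires all handles for the write end to be closed. If ping.exe holds a leaked copy of that handle, T1 stays blocked until ping.exe exits.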

In this case, you can avoid the race condition by passing close_fds=True to Popen when creating the ping.exe process. This prevents it from inheriting inheritable handles, including the pipe handle from the overlapping call in the T1 thread.
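As a rough sketch of that change, the ping.exe call from the question could look like the following. The helper name start_ping is hypothetical, and the non-Windows branch is only a stand-in (a short Python sleep) so the snippet can be tried on other systems; it is not part of the original code.

```python
import subprocess
import sys

def start_ping():
    """Start the child with close_fds=True so it inherits no extra handles."""
    if sys.platform == "win32":
        cmd = ["ping.exe", "-n", "4", "127.0.0.1"]
        extra = {"creationflags": subprocess.CREATE_NEW_CONSOLE}
    else:
        # Assumption: stand-in child for non-Windows systems, illustration only.
        cmd = [sys.executable, "-c", "import time; time.sleep(0.1)"]
        extra = {}
    # close_fds=True maps to bInheritHandles=False on Windows, so a pipe handle
    # that another thread has just made inheritable is not copied into this child.
    return subprocess.Popen(cmd, close_fds=True, **extra)

if __name__ == "__main__":
    proc = start_ping()
    proc.wait()
```

In the question's T2.run, this would amount to adding close_fds=True to the existing Popen call. Python 3 already defaults to close_fds=True when no standard handle is redirected, so the explicit argument mainly matters on Python 2.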

In general, prior to Python 3.7, if you need to support concurrent Popen calls with overridden standard handles, then you'll need to wrap Popen with a function that synchronizes the call by acquiring a lock beforehand. Unfortunately this makes the Popen call a bottleneck in a multi-threaded process, but there's no simple alternative.


Background

In Unix, the close_fds parameter of Popen applies in the child process after fork. If it's true, then all non-standard file descriptors will be closed before calling exec, even those without the FD_CLOEXEC flag set. It does not, however, close the standard file descriptors -- stdin (0), stdout (1), and stderr (2).
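The Unix behavior can be observed with a small experiment, sketched here for Python 3.4 or later on a POSIX system. The helper name fd_visible_in_child and the child script are illustrative, not from the original post. The parent clears FD_CLOEXEC on the write end of a pipe and then asks a child whether that descriptor number is open.

```python
import os
import subprocess
import sys

# Child script: exit with 0 if the descriptor given in argv[1] is open, 1 if not.
CHILD = (
    "import os, sys\n"
    "try:\n"
    "    os.fstat(int(sys.argv[1]))\n"
    "except OSError:\n"
    "    sys.exit(1)\n"
)

def fd_visible_in_child(close_fds):
    """Return True if an inheritable pipe descriptor is still open in the child."""
    r, w = os.pipe()
    try:
        os.set_inheritable(w, True)  # clears FD_CLOEXEC on the write end
        rc = subprocess.call(
            [sys.executable, "-c", CHILD, str(w)],
            close_fds=close_fds,
        )
        return rc == 0
    finally:
        os.close(r)
        os.close(w)

if os.name == "posix":
    print("close_fds=False:", fd_visible_in_child(False))
    print("close_fds=True: ", fd_visible_in_child(True))
```

With close_fds=False the child should see the descriptor, and with close_fds=True it should not, even though FD_CLOEXEC was cleared in both cases.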

In Windows, a handle can be flagged as inheritable (i.e. HANDLE_FLAG_INHERIT). By default, handles are not inheritable. If CreateProcess is called with bInheritHandles as true, then all inheritable handles are inherited by the child. Popen passes bInheritHandles as not close_fds; that is, not closing file descriptors means inheriting handles [*].

In Windows, the stdin, stdout, and stderr parameters of Popen are used to explicitly set the child's standard handles in STARTUPINFO (i.e. hStdInput, hStdOutput, hStdError). If the standard handles aren't overridden explicitly, the parent's standard handles are implicitly inherited by console applications (e.g. python.exe), but not GUI applications (e.g. pythonw.exe). If set explicitly, the handles must be made inheritable, and bInheritHandles (i.e. not close_fds) must be true. This is the source of the race condition when another thread makes an overlapping call to CreateProcess that also inherits handles.

In Python 3, the frequency of this race condition is reduced by defaulting close_fds to true when the standard handles aren't overridden. In 3.7, it's mitigated further by passing the standard handles in the lpAttributeList field of STARTUPINFOEX. With this change, concurrent calls to Popen can override the standard handles without leaking handles. However, the handles still have to be made inheritable, so there's still a race condition with concurrent calls to other functions that inherit handles, such as os.system and os.spawnl.


[*] Note that, despite the parameter name, C 'file descriptors' are not actually inherited in Windows. While the C runtime can inherit file descriptors for system and the spawn / exec functions, its use of STARTUPINFO to implement this is undocumented. Thus Popen only inherits handles. When inheriting a non-standard handle, you need to pass the handle value to the child (e.g. via stdin, the command line, or an environment variable), which you can get via msvcrt.get_osfhandle. The child can open a new file descriptor for the inherited handle via msvcrt.open_osfhandle.

Upvotes: 1
