Prune
Prune

Reputation: 77837

Streaming read from subprocess

I need to read output from a child process as it's produced -- perhaps not on every write, but well before the process completes. I've tried solutions from the Python3 docs and SO questions here and here, but I still get nothing until the child terminates.

The application is for monitoring training of a deep learning model. I need to grab the test output (about 250 bytes for each iteration, at roughly 1-minute intervals) and watch for statistical failures.

Code: variations are commented out.

Parent

cmd = ["/usr/bin/python3", "zzz.py"]
# test_proc = subprocess.Popen(
test_proc = subprocess.run(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT
    )

out_data = ""
print(time.time(), "START")
while not "QUIT" in str(out_data):
    out_data = test_proc.stdout
    # out_data, err_data = test_proc.communicate()
    print(time.time(), "MAIN received", out_data)

Child (zzz.py)

from time import sleep
import sys

for _ in range(5):
    print(_, "sleeping", "."*1000)
    # sys.stdout.flush()
    sleep(1)

print("QUIT this exercise")

Despite sending lines of 1000+ bytes, the buffer (tested elsewhere as 2kb; here, I've gone as high as 50kb) filling doesn't cause the parent to "see" the new text.

What am I missing to get this to work?


Update with regard to links, comments, and iBug's posted answer:

Unless someone comes up with a canonical, nifty solution that makes these obsolete, I'll accept iBug's answer tomorrow.

Upvotes: 10

Views: 1633

Answers (1)

iBug
iBug

Reputation: 37227

subprocess.run always spawns the child process, and blocks the thread until it exits.

The only option for you is to use p = subprocess.Popen(...) and read lines with s = p.stdout.readline() or p.stdout.__iter__() (see below).

This code works for me, if the child process flushes stdout after printing a line (see below for extended note).

cmd = ["/usr/bin/python3", "zzz.py"]
test_proc = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT
)

out_data = ""
print(time.time(), "START")
while not "QUIT" in str(out_data):
    out_data = test_proc.stdout.readline()
    print(time.time(), "MAIN received", out_data)
test_proc.communicate()  # shut it down

See my terminal log (dots removed from zzz.py):

ibug@ubuntu:~/t $ python3 p.py
1546450821.9174328 START
1546450821.9793346 MAIN received b'0 sleeping \n'
1546450822.987753 MAIN received b'1 sleeping \n'
1546450823.993136 MAIN received b'2 sleeping \n'
1546450824.997726 MAIN received b'3 sleeping \n'
1546450825.9975247 MAIN received b'4 sleeping \n'
1546450827.0094354 MAIN received b'QUIT this exercise\n'

You can also do it with a for loop:

for out_data in test_proc.stdout:
    if "QUIT" in str(out_data):
        break
    print(time.time(), "MAIN received", out_data)

If you cannot modify the child process, unbuffer (from package expect - install with APT or YUM) may help. This is my working parent code without changing the child code.

test_proc = subprocess.Popen(
    ["unbuffer"] + cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT
)

Upvotes: 7

Related Questions