Moshe

Reputation: 5129

Poll subprocess finished while looping stdout

I'm writing a script whose subprocess produces an unpredictable amount of output. I want to know, from inside the read loop, when the subprocess has finished.

This is the code:

#!/usr/bin/env python3
import subprocess
import shlex

def main():
    cmd = 'bash -c "for i in $(seq 1 15);do echo $i ;sleep 1;done"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE,
                         universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}")
        print(p.poll())

if __name__ == "__main__":
    main()

p.poll() is always None, even in the last iteration. That makes sense: after each echo the subprocess sleeps for 1 second before it moves to the next iteration or, after the last one, exits.

Any way of making it work?

Upvotes: 1

Views: 289

Answers (1)

Booboo

Reputation: 44013

You have already identified the problem: after the subprocess writes its last line, it keeps running for another second, so every call to poll made inside the loop sees the process as still running. Even if you move the call to poll outside the loop, you may have to wait a bit to give the subprocess a chance to terminate after its final output (I have reduced the loop size; life is too short):

#!/usr/bin/env python3
import subprocess
import shlex
import time

def main():
    cmd = 'bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}", flush=True)
    print(p.poll())
    time.sleep(.1)
    print(p.poll())

if __name__ == "__main__":
    main()

Prints:

bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"
file_name: 1
file_name: 2
file_name: 3
file_name: 4
file_name: 5
None
0
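Rather than guessing a sleep duration, the same check can be done with the timeout parameter of Popen.wait(), which returns as soon as the process exits and raises subprocess.TimeoutExpired otherwise. This is a minimal sketch, not part of the original code: the helper name exit_code_within and the Python one-liner used as the child process are assumptions made for illustration.

```python
import subprocess
import sys


def exit_code_within(p, seconds):
    """Return p's exit code, or None if p is still running after `seconds`."""
    try:
        # Returns immediately once the process has exited.
        return p.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Hypothetical child standing in for the bash loop: one line, then a delay.
    child = [sys.executable, "-c",
             "import time; print('last line', flush=True); time.sleep(0.5)"]
    p = subprocess.Popen(child, stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}", flush=True)
    print(p.poll())                 # may still be None right after EOF
    print(exit_code_within(p, 5))   # waits only as long as actually needed
```

Unlike a fixed time.sleep, this does not waste time when the process has already exited, and it makes the "still running" case explicit.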

To "get it to work" inside the loop requires special knowledge of what the subprocess does between writes. Based on the previous piece of code, we would need a sleep longer than the subprocess's own:

#!/usr/bin/env python3
import subprocess
import shlex
import time

def main():
    cmd = 'bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}", flush=True)
        # The sleep has to be longer than the sleep in the subprocess
        # to give the subprocess a chance to terminate.
        time.sleep(1.1)
        print(p.poll())

if __name__ == "__main__":
    main()

Prints:

bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"
file_name: 1
None
file_name: 2
None
file_name: 3
None
file_name: 4
None
file_name: 5
0

But this is hardly a practical solution. One has to ask what this polling is for: it gives no useful information unless you add sleep calls after your reads, because there will always be some delay between the subprocess's last write and its termination, and those sleep calls are generally wasteful. You should just read until there is no more output and then call p.wait() to wait for the subprocess to terminate, but it's your choice:

#!/usr/bin/env python3
import subprocess
import shlex

def main():
    cmd = 'bash -c "for i in $(seq 1 5);do echo $i; sleep 1; done;"'
    print(cmd)
    p = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE, universal_newlines=True)
    for line in p.stdout:
        print(f"file_name: {line.strip()}", flush=True)
    p.wait()

if __name__ == "__main__":
    main()
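If the exit status matters, note that p.wait() returns it, so the read-then-wait pattern can be wrapped in a small helper. The following is only a sketch under assumptions: the function name run_and_stream, its handle_line callback, and the Python one-liner used as the child process are all hypothetical and not from the original code.

```python
import subprocess
import sys


def run_and_stream(args, handle_line):
    """Run args, pass each stdout line to handle_line, return the exit code."""
    with subprocess.Popen(args, stdout=subprocess.PIPE,
                          universal_newlines=True) as p:
        for line in p.stdout:
            handle_line(line.rstrip("\n"))
        # The loop ends at EOF on the pipe. wait() then blocks only for the
        # gap between the last write and the actual termination.
        return p.wait()


if __name__ == "__main__":
    # Hypothetical child standing in for the bash loop in the question.
    child = [sys.executable, "-c",
             "import time\n"
             "for i in range(1, 4):\n"
             "    print(i, flush=True)\n"
             "    time.sleep(0.2)\n"]
    code = run_and_stream(child, lambda s: print(f"file_name: {s}", flush=True))
    print(f"exit code: {code}")
```

The end of the for loop is the signal that the output is finished; the return value of wait() is the signal that the process itself is finished.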

Upvotes: 2
