Matthew Green

Reputation: 10401

Python subprocess polling not giving return code when used with Java process

I'm having a problem with subprocess poll not returning the return code when the process has finished.

I found out how to set a timeout on subprocess.Popen and used that as the basis for my code. However, one of my calls runs a Java program whose return code is never reported by poll(), so each call "times out" even though the process is actually finished. I know it has finished because, when I remove the poll timeout check, the call runs without issue, returns a good exit code, and completes within the time limit.

Here is the code I am testing with.

import subprocess
import time


def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    print('wait')
    wait = 10
    while process.poll() is None and wait > 0:
        time.sleep(1)
        wait -= 1
    print('done')

    if wait == 0:
        print('terminate')
        process.terminate()

    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    if exit_code != 0:
        print('got bad rc')

if __name__ == '__main__':
    execute(['ping','-n','15','127.0.0.1']) # correctly times out
    execute(['ping','-n','5','127.0.0.1']) # correctly runs within the time limit

    # incorrectly times out
    execute(['C:\\dev\\jdk8\\bin\\java.exe', '-jar', 'JMXQuery-0.1.8.jar', '-url', 'service:jmx:rmi:///jndi/rmi://localhost:18080/jmxrmi', '-json', '-q', 'java.lang:type=Runtime;java.lang:type=OperatingSystem'])

The first two calls behave as expected: one is designed to time out and the other to finish within the time limit. However, the final one (using jmxquery to get Tomcat metrics) never reports an exit code, so it "times out" and has to be terminated, which then causes it to return an exit code of 1.

Is there something I am missing in the way subprocess poll is interacting with this Java process that is causing it to not return an exit code? Is there a way to get a timeout option to work with this?

Upvotes: 1

Views: 1375

Answers (2)

Matthew Green

Reputation: 10401

Using the other answer by @DavisHerring as the basis for more research, I came across a concept that worked for my original case. Here is the code that came out of that.

import subprocess
import threading


def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timer = threading.Timer(10, terminate_process, [process])
    timer.start()

    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    timer.cancel()

    if exit_code != 0:
        print('got bad rc')

def terminate_process(p):
    try:
        p.terminate()
    except OSError:
        pass # ignore error

It uses threading.Timer to terminate the process if it runs past the time limit. Otherwise, communicate() reads both pipes until the process finishes, and the timer is then cancelled.

Upvotes: 0

Davis Herring

Reputation: 39778

This has the same cause as a number of existing questions, but the desire to impose a timeout requires a different answer.

The OS deliberately gives only a small amount of buffer space to each pipe. When a process writes to one that is full (because the reader has not yet consumed the previous output), it blocks. (The reason is that a producer that is faster than its consumer would otherwise be able to quickly use a great deal of memory for no gain.) Therefore, if you want to do more than one of the following with a subprocess, you have to interleave them rather than doing each in turn:

  1. Read from standard output
  2. Read from standard error (unless it’s merged via subprocess.STDOUT)
  3. Wait for the process to exit, or for a timeout to elapse
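To make the blocking concrete, here is a minimal sketch that reproduces the symptom from the question without Java. The child command and the helper name poll_without_reading are hypothetical, chosen only for illustration: the child writes 1 MiB to standard output, which is more than a pipe buffer typically holds, and the parent polls without reading.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the Java call: a child that writes 1 MiB to stdout,
# more than the OS will buffer in a pipe.
CHILD = [sys.executable, '-c', "import sys; sys.stdout.write('x' * (1 << 20))"]


def poll_without_reading(seconds=2.0):
    """Poll a chatty child without draining its pipe, then drain it."""
    process = subprocess.Popen(CHILD, stdout=subprocess.PIPE)
    deadline = time.monotonic() + seconds
    while process.poll() is None and time.monotonic() < deadline:
        time.sleep(0.1)
    # The child is blocked writing to the full pipe, so it has not exited yet.
    code_while_polling = process.poll()
    # Reading the pipe lets the child finish writing and exit.
    stdout, _ = process.communicate()
    return code_while_polling, process.returncode, len(stdout)


if __name__ == '__main__':
    print(poll_without_reading())
```

The first element of the result stays None for the whole polling window, and the exit code only appears once communicate() has drained the pipe. This matches the "times out even though it should have finished" behavior described in the question, assuming the Java call produces enough output to fill a pipe.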

Of course, the subprocess might close its streams before it exits, write useful output after you notice the timeout and before you kill it, and/or start additional processes that keep the pipe open indefinitely, so you might want to have multiple timeouts. Probably the most informative signal is EOF on the pipe. So repeatedly use something like select to wait for (however much is left of) the timeout and issue single reads on the streams that are ready. On EOF, wait for the process to exit (with another timeout if you’re concerned about hangs after an early stream closure). If the timeout occurs instead, (try to) kill the subprocess, and consider issuing non-blocking reads (or another timeout loop) to get any last available output before closing the pipes.
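As a simpler alternative to a hand-written select loop, here is a minimal sketch that assumes Python 3.3 or later, where Popen.communicate() accepts a timeout argument and does the interleaved reading itself. This swaps in the standard library's own mechanism rather than the select approach described above; select does not accept pipes on Windows, which is the platform in the question. The function name execute is borrowed from the question, and the return tuple is an illustrative choice.

```python
import subprocess
import sys


def execute(command, timeout=10):
    """Run command; return (exit_code, stdout, stderr, timed_out)."""
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        # communicate() drains both pipes while it waits, so the child
        # cannot block on a full pipe buffer.
        stdout, stderr = process.communicate(timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        # Kill the child, then collect whatever output it produced.
        process.kill()
        stdout, stderr = process.communicate()
        timed_out = True
    return process.returncode, stdout, stderr, timed_out


if __name__ == '__main__':
    # Hypothetical child that sleeps longer than the timeout.
    sleeper = [sys.executable, '-c', 'import time; time.sleep(30)']
    exit_code, out, err, timed_out = execute(sleeper, timeout=2)
    print(exit_code, timed_out)
```

This covers only the single-timeout case. If the child starts further processes that inherit the pipes, the second communicate() can still wait on them, so the extra timeouts discussed above may still be needed.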

Upvotes: 1
