drjrm3

Reputation: 4718

Grab output from shell command which is run in the background

I saw some useful information in this post about how you can't expect to run a process in the background if you are retrieving output from it using subprocess. The problem is ... this is exactly what I want to do!

I have a script which drops commands to various hosts via ssh and I don't want to have to wait on each one to finish before starting the next. Ideally, I could have something like this:

p, pout, perr = {}, {}, {}
for host in hostnames:
    p[host] = Popen(["ssh", host, mycommand], stdout=PIPE, stderr=PIPE)
    pout[host], perr[host] = p[host].communicate()

which would have (in the case where mycommand takes a very long time) all of the hosts running mycommand at the same time. As it is now, each ssh command appears to finish in its entirety before the next one starts. This is (according to the post I linked above) because I am capturing the output, right? Other than just redirecting the output to a file on each host and reading it back later, is there a decent way to make these commands run on the various hosts in parallel?
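The only workaround I can think of is to start every Popen first and only call communicate() afterwards, something like the sketch below (untested, same hypothetical p/pout/perr dictionaries as above), but I don't know if that is the right approach:

p, pout, perr = {}, {}, {}

# start all processes first; Popen itself does not block
for host in hostnames:
    p[host] = Popen(["ssh", host, mycommand], stdout=PIPE, stderr=PIPE)

# only now wait on each one; the remote commands are already running in parallel
# (note: a command that writes a lot of output could still stall once the OS
# pipe buffer fills, before its communicate() call is reached)
for host in hostnames:
    pout[host], perr[host] = p[host].communicate()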

Upvotes: 0

Views: 400

Answers (2)

Finwood

Reputation: 3981

You could use threads to achieve parallelism and a Queue to retrieve the results in a thread-safe way:

import subprocess
import threading
import Queue

def run_remote_async(host, command, result_queue, identifier=None):
    # accept a plain string as well as a list of arguments
    if isinstance(command, str):
        command = [command]

    # label the result so it can be matched to its host/command later
    if identifier is None:
        identifier = "{}: '{}'".format(host, ' '.join(command))

    def worker(worker_command_list, worker_identifier):
        # runs in its own thread, so communicate() only blocks that thread
        p = subprocess.Popen(worker_command_list,
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE)
        # put (identifier, stdout, stderr) on the queue when done
        result_queue.put((worker_identifier, ) + p.communicate())

    t = threading.Thread(target=worker,
            args=(['ssh', host] + command, identifier),
            name=identifier)
    t.daemon = True
    t.start()

    return t

Then, a possible test case could look like this:

def test():
    data = [('host1', ['ls', '-la']),
            ('host2', 'whoami'),
            ('host3', ['echo', '"Foobar"'])]
    q = Queue.Queue()
    for host, command in data:
        run_remote_async(host, command, q)
    for i in range(len(data)):
        identifier, stdout, stderr = q.get()
        print identifier
        print stdout

Queue.get() blocks, so at this point you can collect the results one after another, as each task completes.
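If you want to guard against a worker that never reports back (for example an ssh connection that hangs), you could pass a timeout to get(); a minimal sketch, assuming the same q as above and a hypothetical 60-second limit:

try:
    identifier, stdout, stderr = q.get(timeout=60)  # wait at most 60 seconds
except Queue.Empty:
    print "a worker did not report back in time"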

Upvotes: 0

Reut Sharabani

Reputation: 31339

You may want to use fabric for this.

Fabric is a Python (2.5-2.7) library and command-line tool for streamlining the use of SSH for application deployment or systems administration tasks.

Example file:

from fabric.api import run, env

def do_mycommand():
    my_command = "ls"  # change to your command
    output = run(my_command)
    print "Output of %s on %s: %s" % (my_command, env.host_string, output)

Now, to execute the command on all hosts (replace host1,host2 ... with your list of hosts):

fab -H host1,host2 ... do_mycommand
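Note that by default Fabric runs the task on each host one after another. Since you want the hosts to run concurrently, Fabric 1.x also offers parallel execution, either with the -P command-line flag or the @parallel decorator; a minimal sketch of the same hypothetical task:

from fabric.api import run, env, parallel

@parallel
def do_mycommand():
    my_command = "ls"  # change to your command
    output = run(my_command)
    print "Output of %s on %s: %s" % (my_command, env.host_string, output)

With the decorator in place, the same fab -H host1,host2 ... do_mycommand invocation runs the command on all hosts at the same time.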

Upvotes: 1
