Alex Reynolds

Reputation: 96937

Why does shell=True work when piping commands together?

I have a couple of subprocess instances that I'd like to string together into a pipeline, but I am stuck and would like to ask for advice.

For example, to mimic:

cat data | foo - | bar - > result

Or:

foo - < data | bar - > result

...I first tried the following, which hangs:

import subprocess, sys

firstProcess = subprocess.Popen(['foo', '-'], stdin=subprocess.PIPE,
                                stdout=subprocess.PIPE)
secondProcess = subprocess.Popen(['bar', '-'], stdin=firstProcess.stdout,
                                 stdout=sys.stdout)

for line in sys.stdin:
    firstProcess.stdin.write(line)
    firstProcess.stdin.flush()

firstProcess.stdin.close()
firstProcess.wait()

My second attempt uses one subprocess instance with the shell=True parameter, which works:

import subprocess, sys

pipedProcess = subprocess.Popen(" ".join(['foo', '-', '|', 'bar', '-']),
                                stdin=subprocess.PIPE, shell=True)

for line in sys.stdin:
    pipedProcess.stdin.write(line)
    pipedProcess.stdin.flush()

pipedProcess.stdin.close()
pipedProcess.wait()

What am I doing wrong with the first, chained-subprocess approach? I have read that it is best to avoid shell=True, so I would like to understand why that version hangs. Thanks for your advice.

EDIT

I fixed a typo in my question and fixed the stdin parameter of secondProcess. It still hangs.

I also tried removing firstProcess.wait() which resolves the hang, but then I get a 0-byte file as result.

I'll stick with the pipedProcess, since it works fine. But if anyone knows why the first setup hangs or makes a 0-byte file as output, I'd be interested to know why as well.

Upvotes: 2

Views: 185

Answers (2)

jfs

Reputation: 414139

To fix the first example, add foo_process.stdout.close(), as the docs suggest. The following code emulates the foo - | bar - command:

#!/usr/bin/python
from subprocess import Popen, PIPE

foo_process = Popen(['foo', '-'], stdout=PIPE)
bar_process = Popen(['bar', '-'], stdin=foo_process.stdout)
foo_process.stdout.close() # allow foo to know if bar ends
bar_process.communicate()  # equivalent to bar_process.wait() in this case

You don't need to use sys.stdin and sys.stdout explicitly here unless they're different from sys.__stdin__ and sys.__stdout__.

To emulate the foo - < data | bar - > result command:

#!/usr/bin/python
from subprocess import Popen, PIPE

with open('data', 'rb') as input_file, open('result', 'wb') as output_file:
    foo = Popen(['foo', '-'], stdin=input_file, stdout=PIPE)
    bar = Popen(['bar', '-'], stdin=foo.stdout, stdout=output_file)
    foo.stdout.close() # allow foo to know if bar ends
bar.wait()
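
The same pattern extends to longer pipelines. What follows is only a minimal sketch, assuming Python 3 on a POSIX system. The helper name run_pipeline and the two stand-in commands (small Python one-liners used in place of foo and bar) are hypothetical and are not part of the subprocess module:

```python
import sys
import tempfile
from subprocess import Popen, PIPE


def run_pipeline(commands, stdin=None, stdout=None):
    """Run commands like `c1 | c2 | ... | cN` and return their exit codes."""
    processes = []
    source = stdin
    for index, command in enumerate(commands):
        is_last = index == len(commands) - 1
        process = Popen(command, stdin=source,
                        stdout=stdout if is_last else PIPE)
        if processes:
            # The parent drops its copy of the previous read end, so the
            # upstream process is notified if this one exits early.
            processes[-1].stdout.close()
        processes.append(process)
        source = process.stdout
    return [process.wait() for process in processes]


# Hypothetical stand-ins for `foo -` and `bar -`.
upper = [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"]
reverse = [sys.executable, "-c",
           "import sys; sys.stdout.write(sys.stdin.read()[::-1])"]

with tempfile.TemporaryFile() as data, tempfile.TemporaryFile() as result:
    data.write(b"abc")
    data.seek(0)  # flush and rewind so the child reads from the start
    codes = run_pipeline([upper, reverse], stdin=data, stdout=result)
    result.seek(0)
    output = result.read()
```

Returning every exit code, rather than only the last one, makes it possible to notice a failure in the middle of the pipeline.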

If you want to feed modified input line by line to the foo process, i.e., to emulate the python modify_input.py | foo - | bar - command:

#!/usr/bin/python
import sys
from subprocess import Popen, PIPE

foo_process = Popen(['foo', '-'], stdin=PIPE, stdout=PIPE)
bar_process = Popen(['bar', '-'], stdin=foo_process.stdout)
foo_process.stdout.close() # allow foo to know if bar ends
for line in sys.stdin:
    print >>foo_process.stdin, "PY", line, # modify input, feed it to `foo`
foo_process.stdin.close() # tell foo there is no more input
bar_process.wait()
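
The print >>file form above is Python 2 syntax. A rough Python 3 equivalent might look like the sketch below, assuming Python 3.7+ (for text=True) on a POSIX system. FOO and BAR are hypothetical stand-ins for foo and bar, and bar's output goes to a temporary file rather than a pipe so that the parent cannot block on a full pipe while it is still writing input:

```python
import sys
import tempfile
from subprocess import Popen, PIPE

# Hypothetical stand-ins: FOO uppercases each line, BAR numbers each line.
FOO = [sys.executable, "-c",
       "import sys\nfor l in sys.stdin: sys.stdout.write(l.upper())"]
BAR = [sys.executable, "-c",
       "import sys\n"
       "for i, l in enumerate(sys.stdin, 1): sys.stdout.write('%d %s' % (i, l))"]


def feed_pipeline(lines):
    """Feed modified lines to FOO | BAR and return BAR's output as text."""
    with tempfile.TemporaryFile("w+") as result:
        foo = Popen(FOO, stdin=PIPE, stdout=PIPE, text=True)
        bar = Popen(BAR, stdin=foo.stdout, stdout=result)
        foo.stdout.close()  # allow foo to know if bar ends
        for line in lines:
            foo.stdin.write("PY " + line)  # modify input, feed it to foo
        foo.stdin.close()  # tell foo there is no more input
        bar.wait()
        foo.wait()
        result.seek(0)
        return result.read()
```

For example, feed_pipeline(["a\n", "b\n"]) passes "PY a" and "PY b" through both stand-ins. In a real script the lines argument would typically be sys.stdin.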

Upvotes: 1

Mattie B

Reputation: 21269

shell=True works because you're asking the shell to interpret your entire command line and handle the piping itself. It is effectively as if you typed foo - | bar - directly into the shell.

(This is also why it can be unsafe to use shell=True; there are many ways to fool the shell into doing bad things that won't happen if you directly pass the command and arguments in as a list that isn't subject to parsing by any intermediaries.)
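
To illustrate the difference, here is a small sketch, assuming Python 3.7+ and a POSIX system where echo is available. The value of untrusted is a made-up example of input that did not come from the programmer:

```python
import shlex
import subprocess

# Hypothetical untrusted value, e.g. a file name supplied by a user.
untrusted = "data; echo INJECTED"

# shell=True: the shell parses the whole string, so ';' starts a second command.
unsafe = subprocess.run("echo " + untrusted, shell=True,
                        capture_output=True, text=True)

# List form: the value reaches echo as a single argument, no shell involved.
safe = subprocess.run(["echo", untrusted], capture_output=True, text=True)

# If a shell is really needed, shlex.quote() escapes the value for POSIX shells.
quoted = subprocess.run("echo " + shlex.quote(untrusted), shell=True,
                        capture_output=True, text=True)

print(repr(unsafe.stdout))  # two lines: the injected command ran
print(repr(safe.stdout))    # one line: the text was printed literally
print(repr(quoted.stdout))  # one line: same as the list form
```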

Upvotes: 2
