Reputation: 41509
I'm trying to feed some data to a chain of processes connected through pipes. However, I can't get the chain to finish.
p1 = subprocess.Popen("sort", stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p2 = subprocess.Popen("uniq", stdin=p1.stdout, stdout=subprocess.PIPE)
p1.communicate(r"""
a
b
c
a""")
out, _ = p2.communicate()
print(out)
The program now just sits waiting. Is there another way I should signal p1 that the input ends?
-- note: I'm running on Windows
Upvotes: 1
Views: 1096
Reputation: 90
You need to close stdin on the first program.
Some things to take note of:
1. There are buffers between pipes (what subprocess.PIPE creates for you), which vary in size per platform and usage. Don't worry about this just yet though, as it's not as relevant as the next point.
2. In this case specifically, sort requires having the full input read before being able to sort (you cannot sort things if you don't know what they are yet).
3. Due to 2, it has its own buffer that collects input and waits for the file descriptor to be closed, which signals that the input is complete ;)
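To make point 3 concrete, here is a minimal sketch of a single process that only produces output once its stdin is closed. It assumes Python 3, and it uses a small Python one-liner (the hypothetical `SORT_LIKE`) as a stand-in for `sort` so that it runs the same way on any platform; the helper name `sort_lines` is also just for illustration.

```python
import subprocess
import sys

# Hypothetical stand-in for `sort`: it reads ALL of stdin before printing anything.
SORT_LIKE = "import sys; sys.stdout.writelines(sorted(sys.stdin.readlines()))"

def sort_lines(data: bytes) -> bytes:
    proc = subprocess.Popen(
        [sys.executable, "-c", SORT_LIKE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    # Fine for small inputs; large inputs should go through communicate()
    # to avoid filling a pipe buffer while nobody is reading.
    proc.stdin.write(data)
    proc.stdin.close()        # EOF: the child can stop reading and start sorting
    out = proc.stdout.read()  # returns once the child closes its stdout
    proc.stdout.close()
    proc.wait()
    return out

result = sort_lines(b"b\na\nc\na\n")
print(result.split())
```

Without the `proc.stdin.close()` line, the `read()` call would block, because the child would still be waiting for more input.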
Edit: Here's the example I was making. I find it personally cleaner to use a pipe directly, since you can then build up your input separately before spawning a process:
In [2]: import os
...: import subprocess
...:
...: # A raw OS-level pipe, which consists of two file descriptors
...: # connected to each other, a la a "pipe".
...: # (This is what subprocess.PIPE sets up btw, hence its name! ;)
...: read, write = os.pipe()
...:
...: # Write what you want to it. In python 2, remove the `b` since all `str`ings are `byte` strings there.
...: os.write(write, b"blahblahblah")
...:
...: # Close the write end to signal completion of input
...: os.close(write)
...:
...: # Spawn process using the pipe as stdin
...: p = subprocess.Popen(['cat'], stdin=read)
...:
blahblahblah
Also, make sure you call p.wait() to wait for the process to complete, or you can end up in situations where you have not gotten the full result yet.
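As a rough sketch of the pipe approach combined with waiting for the child, assuming Python 3: the one-liner `CAT_LIKE` is a hypothetical stand-in for `cat`, and the output is captured through `subprocess.PIPE` here only so that it can be inspected.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for `cat`: copies stdin to stdout.
CAT_LIKE = "import sys; sys.stdout.write(sys.stdin.read())"

read_fd, write_fd = os.pipe()
os.write(write_fd, b"blahblahblah")
os.close(write_fd)           # no more input will arrive on this pipe

p = subprocess.Popen(
    [sys.executable, "-c", CAT_LIKE],
    stdin=read_fd,
    stdout=subprocess.PIPE,
)
os.close(read_fd)            # the parent no longer needs its copy of the read end

out = p.stdout.read()        # read everything the child writes
p.stdout.close()
exit_code = p.wait()         # block until the child has actually exited
print(out, exit_code)
```

Reading the output before calling `p.wait()` matters when stdout is a pipe: a child that writes more than the pipe buffer holds cannot exit until someone reads it.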
Upvotes: 2
Reputation: 24251
Disclaimer: Not an expert here. I haven't used communicate() before, but...
Firstly, according to the docs, communicate() is also meant to read the data from stdout/stderr of the process you are running:
Read data from stdout and stderr
So I guess the output of the p1 execution of sort gets read by your Python program. Or to be precise, on my Linux machine the behavior seems non-deterministic: sometimes it's the Python code and sometimes it's p2/uniq that reads the standard output of p1/sort. I guess they simply compete for the data.
It looks like communicate() is some kind of combo, which does a bit too much for your use case with p1/sort. It's fine with p2/uniq.
On the other hand, if you tried:
import subprocess
p1 = subprocess.Popen("sort", stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p2 = subprocess.Popen("uniq", stdin=p1.stdout, stdout=subprocess.PIPE)
# bytes literal: required on Python 3, harmless on Python 2
p1.stdin.write(b"""
a
b
c
a""")
p1.stdin.close()
out, _ = p2.communicate()
print(out)
it seems to work.
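For reference, a sketch of the same pipeline under Python 3, where pipes carry bytes. The one-liners `SORT_LIKE` and `UNIQ_LIKE` are hypothetical stand-ins so the example runs without the external tools; swapping in "sort" and "uniq" should behave the same way where those programs are available. Closing the parent's copy of p1.stdout is an addition not in the code above; it follows the pattern shown in the subprocess documentation for shell-style pipelines.

```python
import subprocess
import sys

# Hypothetical stand-ins for `sort` and `uniq`.
SORT_LIKE = "import sys; sys.stdout.writelines(sorted(sys.stdin.readlines()))"
UNIQ_LIKE = (
    "import sys, itertools; "
    "sys.stdout.writelines(k for k, _ in itertools.groupby(sys.stdin))"
)

p1 = subprocess.Popen([sys.executable, "-c", SORT_LIKE],
                      stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p2 = subprocess.Popen([sys.executable, "-c", UNIQ_LIKE],
                      stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close()            # only p2 should read p1's output

p1.stdin.write(b"a\nb\nc\na\n")
p1.stdin.close()             # signals end of input to p1

out, _ = p2.communicate()    # read p2's output and wait for it to exit
p1.wait()
print(out.split())
```

The key point is that communicate() is only called on the last process in the chain, so nothing in the Python program competes with p2 for p1's output.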
Upvotes: 1