xtofl

Reputation: 41509

Why does my Popen pipe block?

I'm trying to feed some data to a chain of processes connected through pipes. However, I can't get the chain to terminate.

p1 = subprocess.Popen("sort", stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p2 = subprocess.Popen("uniq", stdin=p1.stdout, stdout=subprocess.PIPE)

p1.communicate(r"""
a
b
c
a""")
out, _ = p2.communicate()
print(out)

The program now just sits waiting. Is there another way I should signal p1 that the input ends?

Note: I'm running on Windows.

Upvotes: 1

Views: 1096

Answers (2)

trevorj

Reputation: 90

You need to close stdin on the first program.

Some things to take notice of:

  1. Every pipe (which is what subprocess.PIPE creates for you) has a buffer, whose size varies by platform and usage. Don't worry about this just yet, though, as it's less relevant than the next point.

  2. In this case specifically, sort has to read its entire input before it can sort anything (you cannot sort things if you don't know what they are yet).

Because of point 2, sort has its own buffer that collects input and waits for the file descriptor to be closed, which signals that the input is complete ;)
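To make the point about closing stdin concrete, here is a minimal sketch. It assumes a small Python child process as a stand-in for sort (the `CHILD` script and the `sort_via_child` name are made up for illustration, not part of the question), so it does not depend on which sort binary is installed:

```python
import subprocess
import sys

# Hypothetical stand-in for `sort`: it reads ALL of stdin before writing anything.
CHILD = (
    "import sys; "
    "sys.stdout.buffer.write(b''.join(sorted(sys.stdin.buffer.readlines())))"
)


def sort_via_child(data):
    """Feed `data` (bytes) to the child and return its sorted output."""
    p = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    p.stdin.write(data)
    # Closing stdin is the end-of-input signal. Without this line the
    # child's read never returns and the read below blocks forever.
    p.stdin.close()
    out = p.stdout.read()
    p.stdout.close()
    p.wait()
    return out


if __name__ == "__main__":
    print(sort_via_child(b"b\nc\na\n"))
```

This is only safe for small inputs. With a lot of data the write can block on a full pipe buffer (point 1), and p.communicate(input=data) is probably the better tool for a single process.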

Edit: Here's the example I was making. I find it personally cleaner to use a pipe directly, since you can then build up your input separately before spawning a process:

In [2]: import os
   ...: import subprocess
   ...: 
   ...: # A raw OS-level pipe, which consists of two file descriptors
   ...: # connected to each other, a la a "pipe".
   ...: # (This is what subprocess.PIPE sets up btw, hence its name! ;)
   ...: read, write = os.pipe()
   ...: 
   ...: # Write what you want to it. In python 2, remove the `b` since all `str`ings are `byte` strings there.
   ...: os.write(write, b"blahblahblah")
   ...: 
   ...: # Close the write end of the pipe to signal the end of input
   ...: os.close(write)
   ...: 
   ...: # Spawn process using the pipe as stdin
   ...: p = subprocess.Popen(['cat'], stdin=read)
   ...: 
blahblahblah

Also, make sure you call p.wait() so the process has finished before you use its result; otherwise you can end up in situations where you have not gotten the full output yet.
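Putting the pre-filled pipe and the p.wait() advice together might look roughly like this. It is a sketch under assumptions: the `ECHO` child script and the `run_with_prefilled_pipe` helper are hypothetical names, and a Python one-liner stands in for cat so the example does not rely on cat being available:

```python
import os
import subprocess
import sys

# Hypothetical stand-in for `cat`: copies stdin to stdout unchanged.
ECHO = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"


def run_with_prefilled_pipe(data):
    """Fill a pipe first, then spawn a child that reads it as stdin."""
    read_fd, write_fd = os.pipe()
    # Only suitable for small data: a write larger than the pipe buffer
    # would block here, because nothing is reading yet.
    os.write(write_fd, data)
    os.close(write_fd)  # end of input for whoever reads the pipe

    p = subprocess.Popen(
        [sys.executable, "-c", ECHO],
        stdin=read_fd,
        stdout=subprocess.PIPE,
    )
    os.close(read_fd)  # the child has its own copy now

    out = p.stdout.read()
    p.stdout.close()
    returncode = p.wait()  # make sure the process has really finished
    return out, returncode


if __name__ == "__main__":
    print(run_with_prefilled_pipe(b"blahblahblah"))
```

Reading stdout before calling wait() matters here: waiting first could hang if the child produced more output than the pipe buffer can hold.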

Upvotes: 2

Grzegorz Oledzki

Reputation: 24251

Disclaimer: Not an expert here. I haven't used communicate() before, but...

Firstly, according to the docs, communicate() does more than send input: it also reads the data from stdout/stderr of the process you are running:

Read data from stdout and stderr

So I guess the output of p1 (sort) gets read by your Python program. Or, to be precise, on my Linux machine the behavior seems nondeterministic: sometimes it's the Python code and sometimes it's p2/uniq that reads the standard output of p1/sort. I guess they simply compete for the data.

It looks like communicate() is a kind of combined operation that does a bit too much for your use case (with p1/sort). It's fine with p2/uniq.


On the other hand, if you tried:

import subprocess

p1 = subprocess.Popen("sort", stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p2 = subprocess.Popen("uniq", stdin=p1.stdout, stdout=subprocess.PIPE)

# Bytes literal: on Python 3, a pipe opened without text mode only accepts bytes.
p1.stdin.write(b"""
a
b
c
a""")
p1.stdin.close()
out, _ = p2.communicate()
print(out)

it seems to work.
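A slightly more defensive variant of the same idea, as a sketch rather than a definitive recipe: the parent also closes its own copy of p1.stdout (it never reads from it, so only p2 should hold that end) and waits for p1 at the end. The `run_pipeline` helper is a made-up name for illustration:

```python
import subprocess


def run_pipeline(first_cmd, second_cmd, data):
    """Send `data` (bytes) through first_cmd | second_cmd, return the output."""
    p1 = subprocess.Popen(first_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    p2 = subprocess.Popen(second_cmd, stdin=p1.stdout, stdout=subprocess.PIPE)

    # The parent must not read p1's output; that belongs to p2.
    # Dropping our copy avoids competing with p2 for the data.
    p1.stdout.close()

    # write() + close() instead of p1.communicate(), which would also
    # try to read p1.stdout. Fine for small inputs only.
    p1.stdin.write(data)
    p1.stdin.close()

    out, _ = p2.communicate()  # p2 is the end of the chain, so this is fine
    p1.wait()
    return out
```

With sort and uniq on the PATH, this would be called as run_pipeline(["sort"], ["uniq"], b"a\nb\nc\na\n").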

Upvotes: 1
