Dagang Wei

Reputation: 26488

Python: subprocess.call broken pipe

I'm trying to call a shell script from Python, but it keeps reporting a broken pipe error (the result is OK, but I don't want to see the error message on stderr). I have pinpointed the cause, and it can be reproduced with the following snippet:

subprocess.call('cat /dev/zero | head -c 10 | base64', shell=True)

AAAAAAAAAAAAAA==

cat: write error: Broken pipe

/dev/zero is an infinite stream, but head -c 10 reads only 10 bytes from it and exits, so cat gets SIGPIPE because the peer has closed the pipe. There's no broken pipe error message when I run the command in a shell, so why does Python show it?

Upvotes: 8

Views: 2609

Answers (2)

subdir

Reputation: 748

The default action for the SIGPIPE signal is to terminate the program. The Python interpreter changes it to SIG_IGN so that it can report broken pipe errors to the program as exceptions.

When you execute cat ... | head ... in a shell, cat has the default SIGPIPE disposition, and the kernel simply terminates it on SIGPIPE.

When you execute cat using subprocess, it inherits the SIGPIPE disposition from its parent (the Python interpreter). SIGPIPE is ignored, so the write fails with EPIPE instead, and cat handles the error itself by checking errno and printing an error message.
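The disposition described above can be inspected from inside the interpreter. This is only a minimal sketch: the helper name sigpipe_disposition is made up for illustration, and it assumes a POSIX platform where signal.SIGPIPE exists.

```python
import signal


def sigpipe_disposition():
    """Return 'ignored', 'default', or 'custom' for this process's SIGPIPE.

    Hypothetical helper for illustration; POSIX only.
    """
    handler = signal.getsignal(signal.SIGPIPE)
    if handler == signal.SIG_IGN:
        return "ignored"
    if handler == signal.SIG_DFL:
        return "default"
    # A Python-level handler, or one installed outside of Python
    return "custom"


if __name__ == "__main__":
    # A child started without resetting the signal inherits this disposition.
    print(sigpipe_disposition())
```

On a stock CPython started from a shell, this would be expected to report that SIGPIPE is ignored, which is the state the child cat ends up inheriting.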

To avoid the error message from cat, you can pass the preexec_fn argument to subprocess.call:

import subprocess
from signal import signal, SIGPIPE, SIG_DFL

subprocess.call(
    'cat /dev/zero | head -c 10 | base64',
    shell=True,
    # Runs in the child before exec: restore the default SIGPIPE action
    preexec_fn=lambda: signal(SIGPIPE, SIG_DFL)
)
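To check that the fix really silences the message, stderr can be captured instead of eyeballed. The following is a rough sketch rather than a definitive recipe: run_pipeline and restore_sigpipe are names invented here, and it assumes a POSIX system with cat and head available. Note that on Python 3.2 and later, subprocess has a restore_signals parameter that defaults to True and already resets SIGPIPE in the child, so the original message may not appear there even without preexec_fn.

```python
import signal
import subprocess


def restore_sigpipe():
    # Runs in the child after fork() and before exec():
    # put SIGPIPE back to its default action (terminate the process).
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)


def run_pipeline(command):
    """Run a shell pipeline; return (returncode, stdout, stderr).

    Hypothetical helper for illustration; POSIX only.
    """
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        preexec_fn=restore_sigpipe,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    rc, out, err = run_pipeline('cat /dev/zero | head -c 10 | base64')
    print(out)
    print(err)  # expected to be empty if cat was terminated by SIGPIPE
```

A named function is used instead of a lambda only to make room for the comment; the behavior is the same as in the snippet above.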

Upvotes: 9

Chris Morgan

Reputation: 90752

In this trivial case at least, you're not gaining anything by using shell commands, and you're losing portability and speed.

Python 2 code:

>>> import base64
>>> base64.b64encode(open('/dev/zero', 'rb').read(10))
'AAAAAAAAAAAAAA=='
>>> base64.b64encode('\0' * 10)
'AAAAAAAAAAAAAA=='

In Python 3 (code will also run in 2.6+, though it will return str rather than bytes instances):

>>> import base64
>>> base64.b64encode(open('/dev/zero', 'rb').read(10))
b'AAAAAAAAAAAAAA=='
>>> base64.b64encode(b'\0' * 10)
b'AAAAAAAAAAAAAA=='

In each case, the first example retains the use of /dev/zero (in itself non-portable, but never mind), while the second produces the same result directly, though I imagine that's not specifically what you want?

Upvotes: 2
