aceminer

Reputation: 4295

Writing a large amount of data to stdin

I am writing a large amount of data to stdin.

How do I ensure that it does not block?

import subprocess

p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
p.stdin.write('A very very very large amount of data')
p.stdin.flush()
output = p.stdout.readline()

It seems to hang at p.stdin.write() after I read a large string and write to it.

I have a large corpus of files (>1k files) which will be written to stdin sequentially.

So what happens is that I am running a loop:

# this loop is repeated for all the files
for stri in lines:
    p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    p.stdin.write(stri)
    output = p.stdout.readline()
    # do some processing

It somehow hangs at file no. 400. The file is a large file with long strings.

I do suspect it's a blocking issue.

This only happens if I iterate from 0 to 1000. However, if I start from file 400, the error does not happen.

Upvotes: 3

Views: 3048

Answers (1)

jfs

Reputation: 414207

To avoid the deadlock in a portable way, write to the child in a separate thread:

#!/usr/bin/env python
from subprocess import Popen, PIPE
from threading import Thread

def pump_input(pipe, lines):
    with pipe:
        for line in lines:
            pipe.write(line)

p = Popen(path, stdin=PIPE, stdout=PIPE, bufsize=1)
Thread(target=pump_input, args=[p.stdin, lines]).start()
with p.stdout:
    for line in iter(p.stdout.readline, b''): # read output
        print line,
p.wait()

See Python: read streaming input from subprocess.communicate()
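If the whole output of each child fits in memory, Popen.communicate() is a thread-free alternative that also avoids the deadlock: it writes the input, closes stdin, and reads all output before returning. A minimal sketch adapted to the per-file loop from the question (the path and lines names are assumed from there; on Python 3 the data passed in must be bytes unless the pipes are opened in text mode):

#!/usr/bin/env python
from subprocess import Popen, PIPE

for stri in lines:
    p = Popen([path], stdin=PIPE, stdout=PIPE)
    # communicate() handles the pipe buffering internally, so it cannot
    # deadlock the way a manual write()/readline() pair can.
    out, _ = p.communicate(stri)
    first_line = out.splitlines()[0] if out else b''
    # do some processing with first_line

The trade-off is that communicate() buffers the child's entire output, so the threaded approach above is preferable when the output may be large or must be consumed as it streams.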

Upvotes: 4
