Reputation: 169
I have trouble starting multiple processes simultaneously without waiting for termination.
I am iterating through a directory and then processing each file's content in an external script.
Command-line execution looks like the following:
python process.py < /dir/file
Here is a sample of the Python code:
import logging
import os
import subprocess

log = logging.getLogger(__name__)

for root, directory, file in os.walk(dir):
    for name in file:
        input_file = open(os.path.join(root, name))
        input_text = input_file.read().encode('utf-8')
        input_file.close()
        command = "python process.py"
        process = subprocess.Popen(command.split(), shell=False, stdin=subprocess.PIPE)
        process.stdin.write(input_text)
        log.debug("Process started with pid {0}".format(process.pid))
        process.communicate()
Is there any way to start them without waiting for termination?
Upvotes: 0
Views: 2126
Reputation: 31339
Yes. Store the processes in a list and don't call process.communicate()
inside the loop; it blocks.
From the docs:
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional input argument should be a string to be sent to the child process, or None, if no data should be sent to the child.
So the result should be something like:
# list to store processes after creating them
processes = list()

for root, directory, file in os.walk(dir):
    for name in file:
        input_file = open(os.path.join(root, name))
        input_text = input_file.read().encode('utf-8')
        input_file.close()
        command = "python process.py"
        process = subprocess.Popen(command.split(),
                                   shell=False,
                                   stdin=subprocess.PIPE,
                                   stdout=subprocess.PIPE,  # capture output so communicate() returns it
                                   stderr=subprocess.PIPE)
        processes.append(process)
        process.stdin.write(input_text)
        process.stdin.close()  # send EOF so the child can finish on its own
        log.debug("Process started with pid {0}".format(process.pid))
        # process.communicate()  <-- this is what blocked the loop

# wait for processes to complete
for process in processes:
    stdoutdata, stderrdata = process.communicate()
    # ... do something with data returned from process
To keep only a limited number of processes running at once, you may want to use a process pool,
which is available through the multiprocessing module, as sketched below.
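A minimal sketch of that approach, assuming dir is the directory from the question and picking a pool size of 4 (both are placeholders); each worker simply redirects one file into process.py:

import multiprocessing
import os
import subprocess

def run_one(path):
    # equivalent to "python process.py < path" on the shell
    with open(path) as f:
        return subprocess.call(["python", "process.py"], stdin=f)

if __name__ == '__main__':
    # collect all file paths first; "dir" is the directory from the question
    paths = [os.path.join(root, name)
             for root, directory, file in os.walk(dir)
             for name in file]
    pool = multiprocessing.Pool(4)  # at most 4 children run at any one time
    pool.map(run_one, paths)        # blocks until every file has been processed
    pool.close()
    pool.join()

With pool.map the pool does the queuing for you: new children are started as earlier ones finish, so you never have more than the chosen number of copies of process.py running at once.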
Upvotes: 3