user2847238

Reputation: 169

Start multiple processes in Python without waiting

I am having trouble starting multiple processes simultaneously without waiting for each one to terminate.

I am iterating through a directory and passing the content of each file to an external script.

The command-line execution looks like the following:

 python process.py < /dir/file

Here is a sample of the Python code:

for root, directory, file in os.walk(dir):
    for name in file:
        input_file = open(os.path.join(root, name))
        input_text = input_file.read().encode('utf-8')
        input_file.close()

        command = "python process.py"
        process = subprocess.Popen(command.split(), shell=False, stdin=subprocess.PIPE)
        process.stdin.write(input_text)
        log.debug("Process started with pid {0}".format(process.pid))
        process.communicate()

Is there any way to start them without waiting for termination?

Upvotes: 0

Views: 2126

Answers (1)

Reut Sharabani

Reputation: 31339

Yes. Store the processes in a list and don't call process.communicate() inside the loop, because it blocks until the process terminates.

From the docs:

Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional input argument should be a string to be sent to the child process, or None, if no data should be sent to the child.

So the result should be something like:

import os
import subprocess

# list to store processes after creating them
processes = list()

for root, directory, file in os.walk(dir):
    for name in file:
        input_file = open(os.path.join(root, name))
        input_text = input_file.read().encode('utf-8')
        input_file.close()

        command = "python process.py"
        process = subprocess.Popen(command.split(),
                                   shell=False,
                                   stdin=subprocess.PIPE)
        processes.append(process)

        process.stdin.write(input_text)
        # close stdin so the child sees end-of-file and can finish on its own
        process.stdin.close()
        log.debug("Process started with pid {0}".format(process.pid))
        # no process.communicate() here: it would block until this process exits

# wait for processes to complete
for process in processes:
    returncode = process.wait()
    # ... do something with the exit code of each process
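
Since the command line in the question redirects the file into the script (`< /dir/file`), another option is to hand the open file to the child as its stdin instead of reading it and writing it to a pipe. The following is only a minimal sketch of that idea: `start_all` and `wait_all` are hypothetical helper names, and `command` is assumed to be an argument list such as `["python", "process.py"]`.

```python
import os
import subprocess
import sys


def start_all(top_dir, command):
    """Start one child per file under top_dir without waiting for any of them."""
    processes = []
    for root, _dirs, files in os.walk(top_dir):
        for name in files:
            # The child gets the open file as its stdin, like `command < file`.
            # The parent's copy can be closed as soon as Popen returns.
            with open(os.path.join(root, name), "rb") as input_file:
                processes.append(subprocess.Popen(command, stdin=input_file))
    return processes


def wait_all(processes):
    """Block until every child has exited and return their exit codes."""
    return [process.wait() for process in processes]
```

Calling `wait_all(start_all(some_dir, ["python", "process.py"]))` would start every child first and only then wait for them, so no file content passes through the parent process.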

To limit the number of processes running at once, you may want to use a process pool, which is available through the multiprocessing module.
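
As a rough sketch of the pool idea, the example below uses `multiprocessing.pool.ThreadPool` rather than a pool of worker processes, on the assumption that each worker only has to start an external command and wait for it, so threads are enough. `run_one`, `run_limited` and `max_parallel` are hypothetical names chosen for illustration, and `command` is again assumed to be an argument list.

```python
import os
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run_one(args):
    """Run the command with one file as stdin; return (path, exit code)."""
    command, path = args
    with open(path, "rb") as input_file:
        # subprocess.call blocks, but only the worker thread that called it
        return path, subprocess.call(command, stdin=input_file)


def run_limited(top_dir, command, max_parallel=4):
    """Run the command once per file, with at most max_parallel children at a time."""
    paths = [os.path.join(root, name)
             for root, _dirs, files in os.walk(top_dir)
             for name in files]
    with ThreadPool(max_parallel) as pool:
        # map returns once every file has been processed
        return pool.map(run_one, [(command, path) for path in paths])
```

With `max_parallel=4`, no more than four copies of the command should be running at any moment, and the returned list pairs each file path with the exit code of its child.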

Upvotes: 3
