Amit Goel

Reputation: 35

How to run multiple processes in parallel in Python

The heading is very generic, but the issue might not be.

I have a script that compiles some code with parameters passed from a file (an xls file). Based on the number of configurations in the xls, I have to compile certain files. I want to store the result of each compilation (stdout and stderr) in a text file whose name comes from the configuration.

I have been able to do all this, but to speed things up I want to run all the compilations in parallel. Is there a way to do this?

Sample code:

for n in num_rows:  # num_rows stores all the rows read using the xlrd object
    parameters_list = [...]  # has all the parameters read from the xls
    # ...
    logfile = open(...)  # the ...txt name is based on a name read from the xls

    p = subprocess.Popen(parameters_list, stdout=logfile, stderr=logfile)
    p.wait()
    logfile.close()

I have to wait for each process to finish before closing its logfile.
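
For reference, here is a minimal runnable version of this sequential loop, assuming a hypothetical config.xls layout where column 0 holds the command line and column 1 the log name:

import subprocess
import xlrd

book = xlrd.open_workbook("config.xls")  # hypothetical file name
sheet = book.sheet_by_index(0)

for n in range(sheet.nrows):
    row = sheet.row_values(n)
    parameters_list = row[0].split()  # e.g. "gcc -c foo.c" -> ["gcc", "-c", "foo.c"]
    logfile = open(row[1], "w")       # e.g. "foo.txt"

    p = subprocess.Popen(parameters_list, stdout=logfile, stderr=logfile)
    p.wait()  # sequential: each compilation finishes before the next starts
    logfile.close()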

My question might be too long, but any help or leads are welcome.

Upvotes: 2

Views: 151

Answers (2)

pppery

Reputation: 3814

You can do this using a multiprocessing.Pool:

import multiprocessing
import subprocess

def parse_row(n):
    parameters_list = [...]  # has all the parameters read from the xls
    # ...
    logfile = open(...)  # the ...txt name is based on a name read from the xls
    p = subprocess.Popen(parameters_list, stdout=logfile, stderr=logfile)
    p.wait()
    logfile.close()

pool = multiprocessing.Pool()        # defaults to one worker per CPU core
pool.map_async(parse_row, num_rows)  # schedule every row for compilation
pool.close()                         # no more tasks will be submitted
pool.join()                          # wait for all workers to finish
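
Note that map_async followed immediately by close() and join() blocks until everything finishes, so a plain pool.map call does the same job and also re-raises any exception from a worker. Here is a self-contained sketch of the approach, with hypothetical placeholder commands and log names standing in for the values read from the xls:

import multiprocessing
import subprocess

# Hypothetical stand-ins for the rows read from the xls file.
rows = [
    (["echo", "building config A"], "configA.txt"),
    (["echo", "building config B"], "configB.txt"),
]

def compile_row(row):
    cmd, logname = row
    # Each worker runs one compilation and captures its stdout/stderr in its own log.
    with open(logname, "w") as logfile:
        subprocess.call(cmd, stdout=logfile, stderr=logfile)

if __name__ == "__main__":  # required for multiprocessing on Windows
    with multiprocessing.Pool() as pool:  # context-manager support needs Python 3.3+
        pool.map(compile_row, rows)       # blocks until every row is compiled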

Upvotes: 2

rmunn

Reputation: 36718

Assuming your processes will all be writing to different logfiles, the answer is quite simple: the subprocess module will already run things in parallel. Just create a different Popen object for each one, and store them in a list:

processes = []
logfiles = []
for n in num_rows:  # num_rows stores all the rows read using the xlrd object
    parameters_list = [...]  # has all the parameters read from the xls
    # ...
    logfile = open(...)  # the ...txt name is based on a name read from the xls

    p = subprocess.Popen(parameters_list, stdout=logfile, stderr=logfile)
    logfiles.append(logfile)
    processes.append(p)

# Now, outside the for loop, the processes are all running in parallel.
# We can simply wait for each of them to finish and close its corresponding logfile.

for p, logfile in zip(processes, logfiles):
    p.wait()  # returns instantly if that process has already finished
    logfile.close()
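
Unlike the Pool approach above, this launches every process at once, with no cap on how many run concurrently, which is usually fine for a modest number of configurations. As a runnable illustration of the pattern, with hypothetical placeholder jobs in place of the real compiler invocations:

import subprocess

# Hypothetical placeholder jobs; in the real script these come from the xls rows.
jobs = [
    (["echo", "building config A"], "configA.txt"),
    (["echo", "building config B"], "configB.txt"),
]

processes = []
logfiles = []
for cmd, logname in jobs:
    logfile = open(logname, "w")
    # Popen returns immediately, so all jobs run concurrently.
    processes.append(subprocess.Popen(cmd, stdout=logfile, stderr=logfile))
    logfiles.append(logfile)

for p, logfile in zip(processes, logfiles):
    p.wait()  # blocks until this particular process exits
    logfile.close()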

Upvotes: 1
