WAF

Reputation: 1151

Parallel independent processes

I want to launch several processes in a loop, but since they take quite a long time to complete, I thought it would be better to run them in parallel. All these processes are independent, i.e. they do not depend on each other's results. Here is a small example that illustrates the type of loop I am dealing with:

inDir = '/path/to/your/dir/'
inTxtList = ['a.txt','b.txt','c.txt','d.txt','e.txt']
for i in inTxtList:
    myfile = open(inDir + i, 'w')
    myfile.write("This is a text file written in python\n")
    myfile.close()

I've tried the multiprocessing package and came up with the following code:

import multiprocessing

# Same inputs as in the example above
inDir = '/path/to/your/dir/'
inTxtList = ['a.txt', 'b.txt', 'c.txt', 'd.txt', 'e.txt']

def worker(num):
    """Process worker function: write a short text file."""
    myfile = open(num, 'w')
    myfile.write("This is my first text file written in python\n")
    myfile.close()
    return

if __name__ == '__main__':
    jobs = []
    for i in inTxtList:
        p = multiprocessing.Process(target=worker, args=(inDir + i,))
        jobs.append(p)
        p.start()
        p.join()  # waits for this process to finish before starting the next

It actually works, but I don't know how to set the number of workers. Could you help me with that?

Upvotes: 2

Views: 1737

Answers (1)

falsetru

Reputation: 368924

Use multiprocessing.Pool.map. You can set the number of workers by passing the processes argument when you create the Pool object:

import os
import multiprocessing

# Same inputs as in the question
inDir = '/path/to/your/dir/'
inTxtList = ['a.txt', 'b.txt', 'c.txt', 'd.txt', 'e.txt']

def worker(num):
    with open(num, 'w') as f:
        f.write("This is my first text file written in python\n")

if __name__ == '__main__':
    number_of_workers = 4
    pool = multiprocessing.Pool(processes=number_of_workers)
    pool.map(worker, [os.path.join(inDir, i) for i in inTxtList])
    pool.close()
    pool.join()
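
On Python 3.3+ the Pool can also be used as a context manager. A minimal sketch of the same idea, assuming the same inDir and inTxtList as above (note that the with-block calls terminate() on exit, which is fine here because map() blocks until every task is done):

import os
import multiprocessing

inDir = '/path/to/your/dir/'   # placeholder, as in the question
inTxtList = ['a.txt', 'b.txt', 'c.txt', 'd.txt', 'e.txt']

def worker(num):
    with open(num, 'w') as f:
        f.write("This is my first text file written in python\n")

if __name__ == '__main__':
    # Pool.__exit__ calls terminate(); map() has already blocked until
    # all files were written, so this is safe here.
    with multiprocessing.Pool(processes=4) as pool:
        pool.map(worker, [os.path.join(inDir, i) for i in inTxtList])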

BTW, use os.path.join instead of manually concatenating path components.
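
For illustration, a quick comparison of the two (the directory here is just a placeholder):

import os

inDir = '/path/to/your/dir'    # no trailing slash
# Manual concatenation silently drops the separator:
print(inDir + 'a.txt')                 # '/path/to/your/dira.txt'
# os.path.join inserts it for you, on any platform:
print(os.path.join(inDir, 'a.txt'))    # '/path/to/your/dir/a.txt'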

Upvotes: 2
