Reputation: 1151
I want to launch several processes in a loop, but since they take quite a long time to complete, I thought it would be better to run them in parallel. All these processes are independent, i.e. they do not depend on each other's results. Here is a small example that illustrates the type of loop I am dealing with:
inDir = '/path/to/your/dir/'
inTxtList = ['a.txt', 'b.txt', 'c.txt', 'd.txt', 'e.txt']
for i in inTxtList:
    myfile = open(inDir + i, 'w')
    myfile.write("This is a text file written in python\n")
    myfile.close()
I've tried the multiprocessing package and come up with the following code:
import multiprocessing

def worker(num):
    """thread worker function"""
    myfile = open(num, 'w')
    myfile.write("This is my first text file written in python\n")
    myfile.close()
    return
if __name__ == '__main__':
    jobs = []
    for i in inTxtList:
        p = multiprocessing.Process(target=worker, args=(inDir + i,))
        jobs.append(p)
        p.start()
        p.join()
It is actually working but I don't know how to set the number of workers. Could you help me with that?
Upvotes: 2
Views: 1737
Reputation: 368924
Use multiprocessing.Pool.map. You can set the number of workers with the processes argument when you create the Pool object:
import os
import multiprocessing

def worker(num):
    with open(num, 'w') as f:
        f.write("This is my first text file written in python\n")

if __name__ == '__main__':
    number_of_workers = 4
    pool = multiprocessing.Pool(processes=number_of_workers)
    pool.map(worker, [os.path.join(inDir, i) for i in inTxtList])
    pool.close()
    pool.join()
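As a side note, here is a minimal self-contained sketch of the same idea, assuming the inDir and inTxtList values from the question; on Python 3.3+ the Pool can also be used as a context manager, which terminates it when the with-block exits:

import os
import multiprocessing

inDir = '/path/to/your/dir/'                               # from the question
inTxtList = ['a.txt', 'b.txt', 'c.txt', 'd.txt', 'e.txt']  # from the question

def worker(path):
    # Each worker process writes one file.
    with open(path, 'w') as f:
        f.write("This is my first text file written in python\n")

if __name__ == '__main__':
    number_of_workers = 4
    # map() blocks until every file has been written; the with-block
    # then terminates the pool.
    with multiprocessing.Pool(processes=number_of_workers) as pool:
        pool.map(worker, [os.path.join(inDir, name) for name in inTxtList])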
BTW, use os.path.join instead of manually concatenating path components.
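For example, os.path.join only inserts a separator when one is needed (shown here with POSIX paths):

import os

print(os.path.join('/path/to/your/dir', 'a.txt'))   # /path/to/your/dir/a.txt
print(os.path.join('/path/to/your/dir/', 'a.txt'))  # /path/to/your/dir/a.txt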
Upvotes: 2