Rob Schneider

Reputation: 165

Multiprocessing a loop in Python

I have the following code that I want to speed up (using multiprocessing).

def main(arg1):
    data = []
    # Calculate new argument arg2
    for i in range(n):
        data.append(function(z, i, arg2))

Where z is a 2D array. My idea was to do it the following way, but I am not sure this will speed up the process.

from multiprocessing import Pool
import itertools

def function_star(a_b_c):
   return function(*a_b_c)

def main(arg1):
   #Calculate new argument arg2
   pool=Pool()
   i=range(n)
   out=pool.map(function_star, i, itertools.repeat(z),itertools.repeat(arg2) )
   pool.close() 

if __name__=="__main__":
  main(arg1)

Is this indeed the most efficient way to speed up the process?

Upvotes: 3

Views: 2975

Answers (1)

hansaplast

Reputation: 11573

If I interpret your code block correctly, you want function to be called with the same z and arg2 every time, but with i running over a range (I am a bit unsure because the code you pasted will not work as is: map only takes one iterable, and you're giving it three).

If this is the case, then functools.partial solves your issue:

from multiprocessing import Pool
from functools import partial

def function(i, z, arg2):
    print(z, i, arg2)

def main(arg1):
    # Calculate new argument arg2
    pool = Pool()
    i = range(n)

    out = pool.map(partial(function, z=5, arg2=3), i)
    pool.close()

if __name__ == "__main__":
    main(arg1)

Note that you need to change the order of the arguments in your function so that the changing i parameter is in the first position.

If you care about speed, you should pass a third argument to map: the chunksize. Each worker process then requests a packet of chunksize tasks from the main process at a time, so you have a smaller number of communications between the main process and the child processes.

Upvotes: 1
