Reputation: 125
I want to parallelise a function that updates a shared dictionary, using Pool instead of Process so that I don't over-allocate CPUs.
i.e. can I take this
    def my_function(bar, results):
        results[bar] = bar*10

    def paralell_XL():
        from multiprocessing import Pool, Manager, Process
        manager = Manager()
        results = manager.dict()
        jobs = []
        for bar in foo:
            p = Process(target=my_function, args=(bar, results))
            jobs.append(p)
            p.start()
        for proc in jobs:
            proc.join()
and change the paralell_XL() function to something like this?
    def paralell_XL():
        from multiprocessing import Pool, Manager, Process
        manager = Manager()
        results = manager.dict()
        p = Pool(processes=4)
        p.map(my_function, (foo, results))
Trying the above gives the following error:

    TypeError: unsupported operand type(s) for //: 'int' and 'DictProxy'
Thanks
Upvotes: 2
Views: 6077
Reputation: 125
So the problem is with passing multiple arguments to Pool.map. As demonstrated in Python multiprocessing pool.map for multiple arguments, you just need to pack the arguments into a tuple and add an unpacking wrapper. This also works for passing a manager.dict as an argument.
    def my_function(bar, results):
        results[bar] = bar*10

    def func_star(a_b):
        """Convert `f((1, 2))` to `f(1, 2)` call."""
        return my_function(*a_b)

    def paralell_XL():
        from multiprocessing import Pool, Manager
        import itertools
        manager = Manager()
        results = manager.dict()
        pool = Pool(processes=4)
        # each item from izip is a (bar, results) tuple, which func_star unpacks
        pool.map(func_star, itertools.izip(foo, itertools.repeat(results)))
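For what it's worth, on Python 3 itertools.izip no longer exists (the built-in zip replaces it) and Pool.starmap removes the need for the func_star wrapper. A minimal sketch, assuming foo is just an iterable of numbers (passed in as a parameter here rather than read from a global):

    import itertools
    from multiprocessing import Pool, Manager

    def my_function(bar, results):
        results[bar] = bar*10

    def paralell_XL(foo):
        manager = Manager()
        results = manager.dict()
        with Pool(processes=4) as pool:
            # starmap unpacks each (bar, results) tuple into my_function(bar, results)
            pool.starmap(my_function, zip(foo, itertools.repeat(results)))
        return dict(results)  # copy the proxy into a plain dict

    if __name__ == '__main__':
        print(paralell_XL(range(5)))  # {0: 0, 1: 10, 2: 20, 3: 30, 4: 40}

The with block makes sure the pool is cleaned up, and the __main__ guard keeps the example safe on platforms that spawn rather than fork worker processes.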
(Note: I think this question and answer are worth keeping, as it wasn't fully clear to me that you could pass the manager.dict into a function this way.)
Upvotes: 3