mysticvisionnnn

Reputation: 125

Passing arguments and manager.dict to Pool in multiprocessing in Python 2.7

I want to parallelise a function that updates a shared dictionary, using Pool instead of Process so that I don't spawn more processes than there are CPUs.

i.e. can I take this

def my_function(bar, results):
    results[bar] = bar * 10

def paralell_XL():

    from multiprocessing import Manager, Process

    manager = Manager()
    results = manager.dict()

    jobs = []
    for bar in foo:  # foo is an iterable defined elsewhere
        p = Process(target=my_function, args=(bar, results))
        jobs.append(p)
        p.start()

    for proc in jobs:
        proc.join()

and change the paralell_XL() function to something like this?

def paralell_XL():

    from multiprocessing import Pool, Manager

    manager = Manager()
    results = manager.dict()

    p = Pool(processes=4)
    p.map(my_function, (foo, results))

Trying the above gives the following error:

TypeError: unsupported operand type(s) for //: 'int' and 'DictProxy'

thanks

Upvotes: 2

Views: 6077

Answers (1)

mysticvisionnnn

Reputation: 125

So the problem is with passing multiple arguments to Pool.map: map expects a one-argument function and a single iterable, so (foo, results) gets iterated over rather than unpacked into two arguments. As demonstrated in Python multiprocessing pool.map for multiple arguments, you just need to bundle the arguments into a tuple and add a wrapper function that unpacks it. This also works for passing a manager.dict as an argument.

def my_function(bar, results):
    results[bar] = bar * 10

def func_star(a_b):
    """Convert `f([1,2])` to `f(1,2)` call."""
    return my_function(*a_b)

def paralell_XL():

    from multiprocessing import Pool, Manager
    import itertools

    manager = Manager()
    results = manager.dict()

    pool = Pool(processes=4)
    # pair every element of foo with the same shared DictProxy
    pool.map(func_star, itertools.izip(foo, itertools.repeat(results)))
    pool.close()
    pool.join()
    return results
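For completeness, here is a minimal, self-contained version of the same pattern that runs as-is (the question never defines foo, so the range(5) below is just a placeholder input):

from multiprocessing import Pool, Manager
import itertools

def my_function(bar, results):
    results[bar] = bar * 10

def func_star(a_b):
    """Convert `f([1,2])` to `f(1,2)` call."""
    return my_function(*a_b)

if __name__ == '__main__':
    foo = range(5)  # placeholder input, not from the question
    manager = Manager()
    results = manager.dict()
    pool = Pool(processes=4)
    pool.map(func_star, itertools.izip(foo, itertools.repeat(results)))
    pool.close()
    pool.join()
    print dict(results)  # {0: 0, 1: 10, 2: 20, 3: 30, 4: 40}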

(Note: I think this question and answer are worth keeping, as it wasn't fully clear to me that you would be able to pass the manager.dict into the function this way.)
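(A side note beyond the original answer: on Python 3.3+ the wrapper becomes unnecessary, since Pool.starmap(my_function, zip(foo, itertools.repeat(results))) unpacks the tuples for you; the func_star trick is the standard Python 2.7 workaround.)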

Upvotes: 3
