mel

Reputation: 2790

Multiprocess a function in Python that takes multiple parameters

I'm trying to use the multiprocessing library in Python, but I ran into some difficulties:

def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    return response.json()

def get_list_event_per_user_per_mpm(limit=100):
    nb_unique_user = get_unique_user()
    print "Unique user: ", nb_unique_user
    processor_pool = multiprocessing.Pool(4)
    offset = range(0, nb_unique_user, limit)
    list_event_per_user = processor_pool.map(request_solr(limit), offset)
    return list_event_per_user

I'm not sure how to pass the second parameter to the function. How can I make it work? I get the following error:

TypeError: 'dict' object is not callable

Upvotes: 0

Views: 931

Answers (3)

Ali SAID OMAR

Reputation: 6792

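I usually use a generator to produce the keyword arguments. This is the content of my simple_multiproc.py.

Note the importance of defining request_solr at module level: the pool pickles functions by name, so the worker processes must be able to import them.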

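import multiprocessing

MAX = 5


def _get_pool_args(**kw):
    # Yield one keyword-argument dict per task; kw overrides the defaults.
    for _ in range(MAX):
        r = {"limit": 10, "offset": 10}
        r.update(kw)
        yield r


def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr, then return response.json()
    print(locals())


def _request_solr_kwargs(kw):
    # pool.map passes each dict as a single positional argument,
    # so unpack it into keyword arguments here.
    return request_solr(**kw)


if __name__ == "__main__":
    pool = multiprocessing.Pool(MAX)
    pool.map(_request_solr_kwargs, _get_pool_args())
    pool.close()
    pool.join()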

Upvotes: 1

Bakuriu

Reputation: 101919

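You see that error because you are calling the function before passing it to multiprocessing: request_solr(limit) is evaluated first, and its return value (a dict) is handed to map as if it were the function.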
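The same error can be reproduced without a pool. In this minimal sketch, request_solr is a hypothetical stand-in that returns a dict, as response.json() would:

```python
def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: echo the arguments.
    return {"limit": limit, "offset": offset}


# This is what map() receives in the question: the *result* of the call,
# not the function itself.
not_a_function = request_solr(100)
print(type(not_a_function).__name__)

try:
    # The pool effectively does this for every offset.
    not_a_function(0)
except TypeError as exc:
    print("TypeError:", exc)
```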

I suggest you use starmap in combination with itertools.repeat:

import itertools as it

# rest of your code

processor_pool = multiprocessing.Pool(4)
offset = range(0, nb_unique_user, limit)
list_event_per_user = processor_pool.starmap(request_solr, zip(it.repeat(limit), offset))

starmap calls your function, expanding each pair of values into two arguments. repeat(limit) simply produces an iterable whose elements are all equal to limit.
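To see what starmap receives, the argument tuples can be built and inspected without a pool. The sketch below uses itertools.starmap, which unpacks tuples in the same way but runs sequentially in the current process. request_solr is a hypothetical stand-in, and the numbers are made up for illustration:

```python
import itertools as it


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: echo the arguments.
    return {"limit": limit, "offset": offset}


limit = 100
offsets = range(0, 250, limit)

# The iterable handed to starmap: one (limit, offset) tuple per call.
tasks = list(zip(it.repeat(limit), offsets))
print(tasks)

# itertools.starmap unpacks each tuple into positional arguments,
# sequentially and in the current process.
results = list(it.starmap(request_solr, tasks))
print(results)
```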

This can work for any number of arguments:

from multiprocessing import Pool

def my_function(a, b, c, d, e):
    return a+b+c+d+e

pool = Pool()
pool.starmap(my_function, [(1,2,3,4,5)])   # calls my_function(1,2,3,4,5)

Pool.starmap was added in Python 3.3. Since you are using an older version of Python, you have to work around this by either modifying your function or using a wrapper function:

def wrapper(arguments):
    # unpack the (limit, offset) tuple into positional arguments
    return request_solr(*arguments)

# later:

pool.map(wrapper, zip(it.repeat(limit), offset))
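Putting the wrapper approach together, a minimal end-to-end sketch might look like this. request_solr is a hypothetical stand-in that just echoes its arguments, and fetch_all is a name introduced only for illustration; the real function would query Solr:

```python
import itertools as it
import multiprocessing


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: echo the arguments.
    return {"limit": limit, "offset": offset}


def wrapper(arguments):
    # Pool.map passes one item per call, so unpack the (limit, offset) tuple here.
    return request_solr(*arguments)


def fetch_all(limit, nb_unique_user, processes=4):
    # Illustrative helper (not in the original code): one task per page of results.
    offsets = range(0, nb_unique_user, limit)
    pool = multiprocessing.Pool(processes)
    try:
        return pool.map(wrapper, zip(it.repeat(limit), offsets))
    finally:
        pool.close()
        pool.join()
```

Calling fetch_all(100, 250) from under an if __name__ == "__main__": guard should return one result per offset (0, 100, 200), in the same order as the offsets, since map preserves input order.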

Upvotes: 2

Morgan Thrapp

Reputation: 9986

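You need to pass a callable to map, not the result of a call. The way you're doing it right now, request_solr(limit) runs first, and map then tries to call its result as a function with each offset as the argument.

With the built-in map, a lambda would do the trick:

map(lambda x: request_solr(limit, x), offset)

Note that this does not work with processor_pool.map, in either 2.x or 3.x: the pool pickles the callable to send it to the worker processes, and a lambda cannot be pickled. Instead, you need to create a function object from a class defined at module level. For example: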

class RequestSolrCaller:
    def __init__(self, limit):
        self.limit = limit
    def __call__(self, offset):
        return request_solr(self.limit, offset)

processor_pool.map(RequestSolrCaller(limit), offset)
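Whether a callable can be sent to the workers can be checked directly with the standard pickle module, which is what the pool relies on. A minimal sketch, where request_solr is a hypothetical stand-in and can_pickle is a helper introduced only for illustration:

```python
import pickle


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: echo the arguments.
    return {"limit": limit, "offset": offset}


class RequestSolrCaller(object):
    def __init__(self, limit):
        self.limit = limit

    def __call__(self, offset):
        return request_solr(self.limit, offset)


def can_pickle(obj):
    # Illustrative helper: True if the standard pickle module can serialize obj.
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False


limit = 100
# A lambda has no importable name, so pickling it fails.
print(can_pickle(lambda x: request_solr(limit, x)))
# An instance of a module-level class is pickled via its class name and attributes.
print(can_pickle(RequestSolrCaller(limit)))
```

When this is run as a script, the lambda fails the check because pickle looks functions up by name, while the class instance passes.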

Upvotes: 1
