Reputation: 2790
I'm trying to use the multiprocessing library in Python, but I've run into some difficulties:
def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    return response.json()

def get_list_event_per_user_per_mpm(limit=100):
    nb_unique_user = get_unique_user()
    print "Unique user: ", nb_unique_user
    processor_pool = multiprocessing.Pool(4)
    offset = range(0, nb_unique_user, limit)
    list_event_per_user = processor_pool.map(request_solr(limit), offset)
    return list_event_per_user
I'm not sure how to pass the second parameter into the function. How can I make it work? I get the following error:
TypeError: 'dict' object is not callable
Upvotes: 0
Views: 931
Reputation: 6792
I usually use a generator to produce the keyword arguments. This is the content of my simple_multiproc.py.
Note the importance of defining request_solr at module level, since the pool has to pickle the function to send it to the worker processes.
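import multiprocessing

MAX = 5

def _get_pool_args(**kw):
    # Yield one dict of keyword arguments per task; kw overrides the defaults.
    for _ in range(MAX):
        r = {"limit": 10, "offset": 10}
        r.update(kw)
        yield r

def request_solr(limit=10, offset=10):
    # build my facets here using limit and offset
    # request solr
    print(locals())
    return response.json()

def _request_solr_kw(kw):
    # pool.map passes each dict as one positional argument, so unpack it here.
    return request_solr(**kw)

if __name__ == "__main__":
    pool = multiprocessing.Pool(MAX)
    pool.map(_request_solr_kw, _get_pool_args())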
Upvotes: 1
Reputation: 101919
You see that error because you are calling the function before passing it to multiprocessing: request_solr(limit) runs immediately and returns a dict, which map then tries to call.
I suggest you use starmap in combination with itertools.repeat:
import itertools as it
# rest of your code
processor_pool = multiprocessing.Pool(4)
offset = range(0, nb_unique_user, limit)
list_event_per_user = processor_pool.starmap(request_solr, zip(it.repeat(limit), offset))
starmap will call your function, expanding each pair of values into two arguments. repeat(limit) simply produces an iterable whose elements are all equal to limit.
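As a self-contained illustration of the starmap approach, here is a minimal sketch for Python 3.3 or later. The request_solr body is a hypothetical stand-in that only echoes its arguments instead of querying Solr, and build_args is a helper name introduced just for this example.

```python
import itertools as it
import multiprocessing


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: it only echoes its arguments.
    return {"limit": limit, "offset": offset}


def build_args(limit, nb_unique_user):
    # Pair the constant limit with each offset: (limit, 0), (limit, limit), ...
    return list(zip(it.repeat(limit), range(0, nb_unique_user, limit)))


if __name__ == "__main__":
    with multiprocessing.Pool(4) as pool:
        # starmap unpacks each (limit, offset) tuple into two positional arguments.
        results = pool.starmap(request_solr, build_args(100, 250))
    print(results)
```

Building the argument list in a separate function makes it easy to check the pairs before handing them to the pool.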
This can work for any number of arguments:
from multiprocessing import Pool

def my_function(a, b, c, d, e):
    return a+b+c+d+e

pool = Pool()
pool.starmap(my_function, [(1,2,3,4,5)])  # calls my_function(1,2,3,4,5)
Since you are using an old version of Python (Pool.starmap was added in Python 3.3), you have to work around this by either modifying your function or using a wrapper function:
def wrapper(arguments):
    return request_solr(*arguments)

# later:
pool.map(wrapper, zip(it.repeat(limit), offset))
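Put together, the wrapper workaround might look like the sketch below, written so that it should run on both Python 2.7 and Python 3. The request_solr body is again a hypothetical stand-in that echoes its arguments, and the numbers are made up for illustration.

```python
import itertools as it
import multiprocessing


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: it only echoes its arguments.
    return {"limit": limit, "offset": offset}


def wrapper(arguments):
    # pool.map passes one item per call, so unpack the (limit, offset) tuple here.
    return request_solr(*arguments)


if __name__ == "__main__":
    limit = 100
    offset = range(0, 250, limit)
    pool = multiprocessing.Pool(4)
    try:
        results = pool.map(wrapper, zip(it.repeat(limit), offset))
    finally:
        # Release the worker processes explicitly; Python 2 pools are not context managers.
        pool.close()
        pool.join()
    print(results)
```

Both wrapper and request_solr live at module level so that the pool can pickle them by name.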
Upvotes: 2
Reputation: 9986
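The way you're doing it right now, it's trying to map the result of request_solr as a function with offset as the argument. You need to pass a callable that takes only offset instead.
With the built-in map, a lambda would do the trick:
map(lambda x: request_solr(limit, x), offset)
Note that this does not work with multiprocessing.Pool.map: the pool has to pickle the callable to send it to the worker processes, and lambdas cannot be pickled. You need to create a function object from a class defined at module level. For example: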
class RequestSolrCaller:
    def __init__(self, limit):
        self.limit = limit

    def __call__(self, offset):
        return request_solr(self.limit, offset)

processor_pool.map(RequestSolrCaller(limit), offset)
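Another option, offered here as a sketch rather than as part of the original answer, is functools.partial. It binds limit in advance and returns a callable that the pool can pickle, provided request_solr is defined at module level. The request_solr body is a hypothetical stand-in and make_caller is a name introduced only for this example.

```python
import functools
import multiprocessing


def request_solr(limit=10, offset=10):
    # Hypothetical stand-in for the real Solr request: it only echoes its arguments.
    return {"limit": limit, "offset": offset}


def make_caller(limit):
    # Bind limit now; the pool supplies offset as the single remaining argument.
    return functools.partial(request_solr, limit)


if __name__ == "__main__":
    pool = multiprocessing.Pool(4)
    try:
        results = pool.map(make_caller(100), range(0, 250, 100))
    finally:
        pool.close()
        pool.join()
    print(results)
```

This does the same job as the RequestSolrCaller class with less boilerplate, at the cost of being slightly less explicit about what is being bound.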
Upvotes: 1