Reputation: 2738
I want to process a large for loop in parallel, and from what I have read the best way to do this is to use the multiprocessing library that comes standard with Python.
I have a list of around 40,000 objects, and I want to process them in parallel in a separate class. The reason for doing this in a separate class is mainly because of what I read here.
In one class I have all the objects in a list and via the multiprocessing.Pool and Pool.map functions I want to carry out parallel computations for each object by making it go through another class and return a value.
# ... some class that generates the list_objects
pool = multiprocessing.Pool(4)
results = pool.map(Parallel, self.list_objects)
And then I have a class which I want to process each object passed by the pool.map function:
class Parallel(object):
    def __init__(self, args):
        self.some_variable = args[0]
        self.some_other_variable = args[1]
        self.yet_another_variable = args[2]
        self.result = None

    def __call__(self):
        self.result = self.calculate(self.some_variable)
The reason I have a __call__ method is the post I linked above, but I'm not sure I'm using it correctly, as it seems to have no effect: self.result is never set.
Any suggestions? Thanks!
Upvotes: 1
Views: 2625
Reputation: 879341
Use a plain function, not a class, when possible. Use a class only when there is a clear advantage to doing so.
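For illustration, here is a minimal sketch of the plain-function version. It assumes each element of the list is a 3-tuple like the args in the question; process_item, run_parallel and the arithmetic inside are made-up placeholders, not anything from the original code.

```python
import multiprocessing


def process_item(item):
    """Hypothetical stand-in for the real per-object computation."""
    some_variable, some_other_variable, yet_another_variable = item
    return some_variable * some_other_variable + yet_another_variable


def run_parallel(items, workers=4):
    # The with-block shuts the pool down once map() has returned.
    with multiprocessing.Pool(workers) as pool:
        # map() returns the results in the same order as the inputs.
        return pool.map(process_item, items)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this part
    # on platforms that start workers by importing the main module.
    list_objects = [(i, i + 1, i + 2) for i in range(10)]
    print(run_parallel(list_objects))
```

The function has to live at module level so that it can be pickled and sent to the worker processes.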
If you really need to use a class, then given your setup, pass an instance of Parallel:
results = pool.map(Parallel(args), self.list_objects)
Since the instance has a __call__ method, the instance itself is callable, like a function.
By the way, __call__ needs to accept an additional argument:
def __call__(self, val):
since pool.map is essentially going to run the equivalent of this in parallel:
p = Parallel(args)
result = []
for val in self.list_objects:
    result.append(p(val))
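Put together, a runnable sketch of the callable-instance approach might look like this. The body of calculate and the run_parallel helper are assumptions added for the example, since the question doesn't show them. Note that __call__ returns the value instead of storing it on self: each worker operates on a pickled copy of the instance, so attributes set there never reach the parent process.

```python
import multiprocessing


class Parallel(object):
    def __init__(self, args):
        # Settings shared by every item; the instance is pickled
        # and sent to the worker processes along with the tasks.
        self.some_variable = args[0]
        self.some_other_variable = args[1]
        self.yet_another_variable = args[2]

    def calculate(self, val):
        # Hypothetical placeholder for the real computation.
        return self.some_variable * val + self.some_other_variable

    def __call__(self, val):
        # pool.map collects return values, so return the result.
        return self.calculate(val)


def run_parallel(args, values, workers=4):
    with multiprocessing.Pool(workers) as pool:
        return pool.map(Parallel(args), values)


if __name__ == "__main__":
    print(run_parallel((2, 1, 0), list(range(10))))
```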
Upvotes: 3
Reputation: 23322
Pool.map simply applies a function (actually, any callable) in parallel. It has no notion of objects or classes. Since you pass it a class, calling it just creates an instance, which runs __init__; __call__ is never executed. You need to either call it explicitly from __init__ or use pool.map(Parallel.__call__, preinitialized_objects).
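A rough sketch of the second option, assuming Python 3 (where Parallel.__call__ is an ordinary function that can be pickled by name). The calculate body and the run_parallel helper are invented for the example. __call__ returns its result here, because setting self.result inside a worker only changes the worker's copy of the object.

```python
import multiprocessing


class Parallel(object):
    def __init__(self, args):
        self.some_variable = args[0]
        self.some_other_variable = args[1]
        self.yet_another_variable = args[2]

    def calculate(self, value):
        # Hypothetical placeholder for the real computation.
        return value * value

    def __call__(self):
        # Return the value so pool.map can send it back to the parent.
        return self.calculate(self.some_variable)


def run_parallel(list_objects, workers=4):
    # Build the instances up front, in the parent process.
    preinitialized_objects = [Parallel(args) for args in list_objects]
    with multiprocessing.Pool(workers) as pool:
        # Each instance is passed as the 'self' argument of __call__.
        return pool.map(Parallel.__call__, preinitialized_objects)


if __name__ == "__main__":
    print(run_parallel([(i, i + 1, i + 2) for i in range(10)]))
```

With 40,000 objects, every instance gets pickled and shipped to a worker, so keeping the objects small matters more here than in the callable-instance version.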
Upvotes: 2