Reputation: 33
As a Python beginner, I'm trying to parallelize some sections of a function that serves as an input to an optimization routine. This function f returns the log-likelihood, the gradient and the hessian for a given vector b. In this function there are three independent loop functions: loop_1
, loop_2
, and loop_3
.
What is the most efficient implementation? Parallelizing the three loop functions in three concurrent processes or parallelizing one loop at a time? And how can this be implemented? When using the multiprocessing package I get a 'pickle' error, as my nested loop functions are not in the general namespace.
def f(b):
# Do something computational intensive on b
def calc(i, j):
return u, v, w
def loop_1():
for i in range(1:1000):
c, d, e = calc(i, 0)
for j in range(1:200):
f, g, h = calc(i, j)
return x, y, z
def loop_2():
# similar to loop_1
def loop_3():
# similar to loop_1
# Aggregate results from the three loops
return u, v, w
Upvotes: 0
Views: 1925
Reputation: 424
There are several ways to avoid the pickling error you receive.
An option could be asynchronous, if makes sense to do so. Sometimes it makes it slower, sometimes it makes it slower.
In that case it would look something like the code bellow, I use it as a templet when I forget things:
import asyncio
def f():
async def factorial(n):
f.p = 2
await asyncio.sleep(0.2)
return 1 if n < 2 else n * await factorial(n-1)
async def multiply(n, k):
await asyncio.sleep(0.2)
return sum(n for _ in range(k))
async def power(n, k):
await asyncio.sleep(0.2)
return await multiply(n, await power(n, k-1)) if k != 0 else 1
loop = asyncio.get_event_loop()
tasks = [asyncio.ensure_future(power(2, 5)),
asyncio.ensure_future(factorial(5))]
f.p = 0
ans = tuple(loop.run_until_complete(asyncio.gather(*tasks)))
print(f.p)
return ans
if __name__ == '__main__':
print(f())
Async and await is builtin keywords like def, for, in and such in python3.5.
Another work around with functions in functions is to use threads instead.
from concurrent.futures import ThreadPoolExecutor
import time
def f():
def factorial(n):
f.p = 2
time.sleep(0.2)
return 1 if n < 2 else n*factorial(n-1)
def multiply(n, k):
time.sleep(0.2)
return sum(n for _ in range(k))
def power(n, k):
time.sleep(0.2)
return multiply(n, power(n, k-1)) if k != 0 else 1
def calculate(func, args):
return func(*args)
def calculate_star(args):
return calculate(*args)
pool = ThreadPoolExecutor()
tasks = [(power, (2, 5)), (factorial, (5, ))]
f.p = 0
result = list(pool.map(calculate_star, tasks))
print(f.p)
return result
if __name__ == '__main__':
print(f())
Upvotes: 1
Reputation: 932
You should start your function in a pool of process.
import multiprocessing
pool = multiprocessing.Pool()
for i in range(3):
if i == 0:
pool.apply_async(loop_1)
elif i == 1:
pool.apply_async(loop_2)
if i == 2:
pool.apply_async(loop_3)
pool.close()
If loop_1,loop_2 and loop_3 are the same functions with same operations you simply can call loop_3 three times.
Upvotes: 0