Seb

Reputation: 33

Parallelize nested functions in Python

As a Python beginner, I'm trying to parallelize some sections of a function that serves as an input to an optimization routine. This function f returns the log-likelihood, the gradient and the Hessian for a given vector b. Inside this function there are three independent loop functions: loop_1, loop_2, and loop_3.

What is the most efficient implementation: parallelizing the three loop functions in three concurrent processes, or parallelizing one loop at a time? And how can this be implemented? When using the multiprocessing package I get a 'pickle' error, as my nested loop functions are not in the global namespace.

def f(b):

  # Do something computationally intensive on b

  def calc(i, j):
    return u, v, w

  def loop_1():
    for i in range(1, 1000):
      c, d, e = calc(i, 0)

      for j in range(1, 200):
        f, g, h = calc(i, j)

    return x, y, z

  def loop_2():
    # similar to loop_1

  def loop_3():
    # similar to loop_1

  # Aggregate results from the three loops

  return u, v, w

Upvotes: 0

Views: 1925

Answers (2)

Simon

Reputation: 424

There are several ways to avoid the pickling error you receive.

One option could be to go asynchronous, if it makes sense to do so. Sometimes it makes things faster, sometimes it makes them slower.

In that case it would look something like the code below; I use it as a template when I forget things:

import asyncio


def f():
    async def factorial(n):
        f.p = 2
        await asyncio.sleep(0.2)
        return 1 if n < 2 else n * await factorial(n-1)

    async def multiply(n, k):
        await asyncio.sleep(0.2)
        return sum(n for _ in range(k))

    async def power(n, k):
        await asyncio.sleep(0.2)
        return await multiply(n, await power(n, k-1)) if k != 0 else 1

    # schedule both coroutines on one event loop and run them concurrently
    loop = asyncio.get_event_loop()
    tasks = [asyncio.ensure_future(power(2, 5)),
             asyncio.ensure_future(factorial(5))]
    f.p = 0
    ans = tuple(loop.run_until_complete(asyncio.gather(*tasks)))
    print(f.p)
    return ans

if __name__ == '__main__':
    print(f())

async and await are built-in keywords, like def, for and in, since Python 3.5.

Another workaround for functions defined inside functions is to use threads instead:

from concurrent.futures import ThreadPoolExecutor
import time

def f():
    def factorial(n):
        f.p = 2
        time.sleep(0.2)
        return 1 if n < 2 else n*factorial(n-1)

    def multiply(n, k):
        time.sleep(0.2)
        return sum(n for _ in range(k))

    def power(n, k):
        time.sleep(0.2)
        return multiply(n, power(n, k-1)) if k != 0 else 1

    def calculate(func, args):
        return func(*args)

    def calculate_star(args):
        # unpack a (func, args) pair so pool.map can call it with one argument
        return calculate(*args)

    pool = ThreadPoolExecutor()
    tasks = [(power, (2, 5)), (factorial, (5, ))]
    f.p = 0
    result = list(pool.map(calculate_star, tasks))
    print(f.p)
    return result

if __name__ == '__main__':
    print(f())
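
Applied to your question, the same thread-based approach could look like the sketch below. loop_1, loop_2 and loop_3 are placeholders standing in for your own loop functions, and because of the GIL this only speeds things up if the heavy lifting happens in code that releases it (e.g. NumPy):

from concurrent.futures import ThreadPoolExecutor

def f(b):
    # ... do something computationally intensive on b ...

    def loop_1():
        ...  # placeholder for your first loop

    def loop_2():
        ...  # placeholder for your second loop

    def loop_3():
        ...  # placeholder for your third loop

    # Run the three independent loops concurrently; nested functions are fine
    # here because threads share memory and nothing needs to be pickled.
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = [pool.submit(loop) for loop in (loop_1, loop_2, loop_3)]
        results = [fut.result() for fut in futures]

    # Aggregate results from the three loops
    return results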

Upvotes: 1

nrhode

Reputation: 932

You should start your functions in a pool of processes.

import multiprocessing

pool = multiprocessing.Pool()
for i in range(3):
    if i == 0:
        pool.apply_async(loop_1)
    elif i == 1:
        pool.apply_async(loop_2)
    elif i == 2:
        pool.apply_async(loop_3)
pool.close()
pool.join()

If loop_1, loop_2 and loop_3 are the same function performing the same operations, you can simply call loop_3 three times.
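
The pickle error you mention comes from the loop functions being nested inside f. A minimal sketch of this idea, under the assumption that the loops can be moved to module level and take b as an argument (the names here are placeholders for your actual code):

import multiprocessing

# Defined at module level so they can be pickled and sent to worker processes.
def loop_1(b):
    ...  # placeholder for your first loop

def loop_2(b):
    ...  # placeholder for your second loop

def loop_3(b):
    ...  # placeholder for your third loop

def f(b):
    with multiprocessing.Pool(processes=3) as pool:
        # submit the three independent loops, then collect their return values
        async_results = [pool.apply_async(loop, (b,))
                         for loop in (loop_1, loop_2, loop_3)]
        results = [r.get() for r in async_results]

    # Aggregate results from the three loops
    return results

On Windows the Pool has to be created under an if __name__ == '__main__': guard, and whether this pays off depends on how much work each loop does compared to the cost of starting the processes.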

Upvotes: 0
