Peter

Reputation: 395

Scipy's optimization incompatible with Multiprocessing?

While using SciPy's optimization algorithms to minimize a function that computes its value in a subprocess, I found that gradient-based algorithms (basinhopping and L-BFGS-B so far) raise the following error at line 562 of optimize.py:

grad[k] = (f(*((xk + d,) + args)) - f0) / d[k]

TypeError: unsupported operand type(s) for -: 'NoneType' and 'NoneType'

Here's a simple example of code that generates this error:

import multiprocessing as mp
from scipy.optimize import basinhopping

def runEnvironment(x):
    return x**2

def func(x):
    if __name__ == '__main__':
        print "x:",x
        pool = mp.Pool(processes=1)

        results=pool.apply(runEnvironment,(x,))
        pool.close()
        return results

x0=5    
ret=basinhopping(func, x0, niter=100, T=1.0, stepsize=0.1, minimizer_kwargs=None, take_step=None, accept_test=None, callback=None, interval=50, disp=False, niter_success=None)

Note that this code runs fine if the multiprocessing components are removed, or if a non-gradient-based algorithm (like COBYLA) is used. Can anyone think of a reason this is happening?

Upvotes: 1

Views: 2216

Answers (2)

unutbu

Reputation: 879849

It is inefficient to create many little mp.Pools each with only one worker process. It is also inefficient to create one Pool per call to func, since func is called many times.

Instead, create one Pool at the outset of your program, and pass the pool to each call to func:

if __name__ == '__main__':
    pool = mp.Pool()
    x0=5    
    ret = optimize.basinhopping(
        func, x0, niter=100, T=1.0, stepsize=0.1,
        minimizer_kwargs=dict(args=pool), 
        take_step=None, accept_test=None, callback=None,
        interval=50, disp=False, niter_success=None)
    pool.close()
    print(ret)

The minimizer_kwargs=dict(args=pool) argument tells optimize.basinhopping to forward pool to the local minimizer, which in turn passes it to func as an additional argument on every call.
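As a rough illustration of what happens to args, here is a minimal sketch. The helper forward_diff_grad and the objective shifted_square are hypothetical names, not part of SciPy. The helper mimics the forward-difference line from the traceback in the question and appends the extra arguments to every call of the objective, which is essentially what the minimizer does with args.

```python
def forward_diff_grad(f, x, args=(), eps=1e-8):
    """Hypothetical 1-D stand-in for a finite-difference gradient."""
    # Assumption: like scipy.optimize.minimize, accept a bare value for
    # args and wrap it in a tuple.
    if not isinstance(args, tuple):
        args = (args,)
    # The extra arguments are appended to every call of the objective.
    f0 = f(x, *args)
    return (f(x + eps, *args) - f0) / eps


def shifted_square(x, shift):
    # Illustrative objective that needs one extra argument.
    return (x - shift) ** 2


# shift=2.0 reaches shifted_square the same way pool reaches func.
grad = forward_diff_grad(shifted_square, 5.0, args=2.0)
print(grad)  # close to the exact derivative 2 * (5 - 2) = 6
```

If the objective returned None, the subtraction inside forward_diff_grad would fail in the same way as the line quoted in the question.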


You can also use

logger = mp.log_to_stderr(logging.INFO)

to obtain logging statements which show in what process the functions are being called. For example,

import multiprocessing as mp
from scipy import optimize
import logging
logger = mp.log_to_stderr(logging.INFO)

def runEnvironment(x):
    logger.info('runEnvironment({}) called'.format(x))
    return x**2

def func(x, pool):
    logger.info('func({}) called'.format(x))
    results = pool.apply(runEnvironment,(x,))
    return results

if __name__ == '__main__':
    pool = mp.Pool()
    x0=5    
    ret = optimize.basinhopping(
        func, x0, niter=100, T=1.0, stepsize=0.1,
        minimizer_kwargs=dict(args=pool), 
        take_step=None, accept_test=None, callback=None,
        interval=50, disp=False, niter_success=None)
    pool.close()
    print(ret)

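prints output that begins like this (the number of PoolWorker lines depends on how many CPUs you have, since mp.Pool() starts one worker per CPU by default):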

[INFO/PoolWorker-1] child process calling self.run()
[INFO/PoolWorker-2] child process calling self.run()
[INFO/PoolWorker-3] child process calling self.run()
[INFO/PoolWorker-4] child process calling self.run()
[INFO/MainProcess] func([ 5.]) called
[INFO/PoolWorker-1] runEnvironment([ 5.]) called
[INFO/MainProcess] func([ 5.00000001]) called
[INFO/PoolWorker-2] runEnvironment([ 5.00000001]) called
[INFO/MainProcess] func([ 5.]) called
[INFO/PoolWorker-3] runEnvironment([ 5.]) called
[INFO/MainProcess] func([-5.]) called
[INFO/PoolWorker-4] runEnvironment([-5.]) called

This shows that func is always called by the main process, while runEnvironment is run by worker processes.

Note that the calls to func happen sequentially, so only one worker is busy at any moment. To get any benefit out of the pool, each call to func has to spread its own work across several worker processes.
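One way to do that, sketched under the assumption that the objective is a sum of independent evaluations: the offsets list and the (x, offset) task format below are hypothetical and only stand in for whatever independent pieces your real objective has. Each call to func fans its pieces out with pool.map and combines the results.

```python
import multiprocessing as mp


def run_environment(task):
    # Hypothetical unit of work: one independent evaluation.
    x, offset = task
    return (x - offset) ** 2


def func(x, pool, offsets):
    # Fan the independent evaluations out to the workers, then combine
    # them into the single value the optimizer expects.
    tasks = [(x, offset) for offset in offsets]
    return sum(pool.map(run_environment, tasks))


if __name__ == '__main__':
    pool = mp.Pool(processes=2)
    try:
        print(func(1.0, pool, [0.0, 1.0, 2.0, 3.0]))  # prints 6.0
    finally:
        pool.close()
        pool.join()
```

With basinhopping, both extra arguments could be passed as minimizer_kwargs=dict(args=(pool, offsets)). Whether this pays off depends on each piece of work being expensive enough to outweigh the cost of sending it to a worker.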

Upvotes: 0

Thomite

Reputation: 741

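Your if __name__ == '__main__': guard is in the wrong place. Inside func, it makes func return None whenever the module is not running as __main__, which is why the gradient code ends up subtracting None from None. Put the guard around the top-level basinhopping call instead; rearranged this way, the code works: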

import multiprocessing as mp
from scipy.optimize import basinhopping

def runEnvironment(x):
    return x**2

def func(x):

    print "x:",x
    pool = mp.Pool(processes=1)

    results=pool.apply(runEnvironment,(x,))
    pool.close()
    return results

if __name__ == '__main__':
    x0=5
    ret=basinhopping(func, x0, niter=100, T=1.0, stepsize=0.1, minimizer_kwargs=None, take_step=None, accept_test=None, callback=None, interval=50, disp=False, niter_success=None)
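To see why the original placement produces that exact TypeError, here is a small sketch that needs neither SciPy nor multiprocessing. The module_name parameter is a hypothetical stand-in for the real __name__. It assumes the worker process re-imports the script under a different module name, as happens on platforms that spawn rather than fork ('__mp_main__' is the name Python 3 uses; older versions used a different one).

```python
def func(x, module_name):
    # module_name stands in for __name__ (illustration only).
    if module_name == '__main__':
        return x ** 2
    # No return on this path, so Python implicitly returns None.


in_main = func(5, '__main__')      # value seen when run as the main script
in_child = func(5, '__mp_main__')  # value seen in a re-imported copy

try:
    in_child - in_child            # what the finite-difference line does
    message = None
except TypeError as exc:
    message = str(exc)

print(in_main, in_child)
print(message)
```

With the guard at the top level, the re-imported copy never calls basinhopping at all, so func is only ever evaluated in the main process.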

Upvotes: 1
