Reputation: 23637
I need to minimize a cost function for a large number (1000s) of different inputs. Obviously, this can be implemented by looping over scipy.optimize.minimize
or any other minimization routine. Here is an example:
import numpy as np
import scipy as sp

def cost(x, a, b):
    return np.sum((np.sum(a * x.reshape(a.shape), axis=1) - b)**2)

a = np.random.randn(500, 40)
b = np.array(np.arange(500))

x = []
for i in range(a.shape[0]):
    res = sp.optimize.minimize(cost, np.zeros(40), args=(a[None, i], b[None, i]))
    x.append(res.x)
This finds the x[i, :] that minimizes cost for each a[i, :] and b[i], but it is very slow. I guess looping over minimize causes considerable overhead.
A partial solution is to solve for all x simultaneously:
res = sp.optimize.minimize(cost, np.zeros_like(a), args=(a, b))
This is even slower than the loop. minimize does not know that the elements of x are group-wise independent, so it computes the full Hessian, although a block-diagonal matrix would be sufficient given the problem structure. This is slow and overflows my computer's memory.
Is there any way to inform minimize or another optimization function about the problem structure so that it can solve multiple independent optimizations in a single function call? (Similar to certain options supported by Matlab's fsolve.)
Upvotes: 3
Views: 2566
Reputation: 23637
First, a solution: it turns out scipy.optimize.least_squares supports exploiting the structure of the Jacobian by setting the jac_sparsity argument.
The least_squares function works slightly differently from minimize, so the cost function needs to be rewritten to return residuals instead:
def residuals(x, a, b):
    return np.sum(a * x.reshape(a.shape), axis=1) - b
The Jacobian has a block-diagonal sparsity structure, given by:
jacs = sp.sparse.block_diag([np.ones((1, 40), dtype=bool)]*500)
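As a quick sanity check (my addition, assuming the same 500×40 problem size as above), the sparsity pattern has one row per residual and one column per unknown:

# jacs.shape == (500, 20000): 500 residuals, 500*40 unknowns;
# residual i depends only on x[i*40:(i+1)*40], so there are 500*40 nonzeros
print(jacs.shape)
print(jacs.nnz)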
And calling the optimization routine:
res = sp.optimize.least_squares(residuals, np.zeros(500*40),
                                jac_sparsity=jacs, args=(a, b))
x = res.x.reshape(500, 40)
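Since each row contributes a single equation in 40 unknowns, the optimizer should be able to drive every residual to (approximately) zero. A quick check of that, not part of the original answer, could be:

# maximum absolute residual at the solution; should be close to zero
print(np.abs(residuals(res.x, a, b)).max())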
But is it really faster?
%timeit opt1_loopy_min(a, b) # 1 loop, best of 3: 2.43 s per loop
%timeit opt2_loopy_min_start(a, b) # 1 loop, best of 3: 2.55 s per loop
%timeit opt3_loopy_lsq(a, b) # 1 loop, best of 3: 13.7 s per loop
%timeit opt4_dense_lsq(a, b) # ValueError: array is too big; ...
%timeit opt5_jacs_lsq(a, b) # 1 loop, best of 3: 1.04 s per loop
Conclusions:
There is no significant difference between looping over minimize (opt1) and re-using the previous result as the starting point (opt2) without sorting.
Looping over least_squares (opt3) is considerably slower than looping over minimize (opt1, opt2).
Solving all problems at once with least_squares (opt4) is not possible because the dense Jacobian matrix does not fit in memory.
Exploiting the sparsity structure of the Jacobian with least_squares (opt5) seems to be the fastest approach.
This is the timing test environment:
import numpy as np
import scipy as sp

def cost(x, a, b):
    return np.sum((np.sum(a * x.reshape(a.shape), axis=1) - b)**2)

def residuals(x, a, b):
    return np.sum(a * x.reshape(a.shape), axis=1) - b

a = np.random.randn(500, 40)
b = np.arange(500)

def opt1_loopy_min(a, b):
    x = []
    x0 = np.zeros(a.shape[1])
    for i in range(a.shape[0]):
        res = sp.optimize.minimize(cost, x0, args=(a[None, i], b[None, i]))
        x.append(res.x)
    return np.stack(x)

def opt2_loopy_min_start(a, b):
    x = []
    x0 = np.zeros(a.shape[1])
    for i in range(a.shape[0]):
        res = sp.optimize.minimize(cost, x0, args=(a[None, i], b[None, i]))
        x.append(res.x)
        x0 = res.x
    return np.stack(x)

def opt3_loopy_lsq(a, b):
    x = []
    x0 = np.zeros(a.shape[1])
    for i in range(a.shape[0]):
        res = sp.optimize.least_squares(residuals, x0, args=(a[None, i], b[None, i]))
        x.append(res.x)
    return x

def opt4_dense_lsq(a, b):
    res = sp.optimize.least_squares(residuals, np.zeros(a.size), args=(a, b))
    return res.x.reshape(a.shape)

def opt5_jacs_lsq(a, b):
    jacs = sp.sparse.block_diag([np.ones((1, a.shape[1]), dtype=bool)] * a.shape[0])
    res = sp.optimize.least_squares(residuals, np.zeros(a.size),
                                    jac_sparsity=jacs, args=(a, b))
    return res.x.reshape(a.shape)
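The %timeit results above come from IPython; as a plain-Python alternative (my own sketch, timing a single run of a few of the variants), the standard timeit module could be used:

import timeit

# one run per variant; absolute numbers depend on hardware and SciPy version
for f in (opt1_loopy_min, opt2_loopy_min_start, opt5_jacs_lsq):
    print(f.__name__, timeit.timeit(lambda: f(a, b), number=1))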
Upvotes: 4
Reputation:
"I guess looping over minimize causes considerable overhead."
Wrong guess. The time required for minimizing a function dwarfs any loop overhead. There is no vectorization magic for this problem.
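As a rough illustration of why (my own snippet, reusing cost, a, b and the imports from the question), compare one actual minimization against the bookkeeping of a single loop iteration:

import timeit

# one minimize call vs. the indexing done per loop iteration
one_solve = timeit.timeit(
    lambda: sp.optimize.minimize(cost, np.zeros(40), args=(a[None, 0], b[None, 0])),
    number=10) / 10
one_iter = timeit.timeit(lambda: (a[None, 0], b[None, 0]), number=10000) / 10000
print(one_solve, one_iter)  # the solve is orders of magnitude slower than the loop bookkeeping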
Some time can be saved by using a better starting point for each minimization. First, sort the parameters so that consecutive iterations have similar parameters. Then use the result of the previous minimization as the starting point of the next one:
a = np.sort(np.random.randn(500, 40), axis=0)  # sorted parameters
b = np.arange(500)  # no need for np.array here, np.arange is already an ndarray

x = []
x0 = np.zeros(40)
for i in range(a.shape[0]):
    res = minimize(cost, x0, args=(a[None, i], b[None, i]))
    x.append(res.x)
    x0 = res.x  # warm start: reuse the previous solution
This saves 30-40 percent of execution time in my test.
Another, minor, optimization is to preallocate an ndarray of the appropriate size for the resulting x-values instead of using a list and its append method: before the loop, x = np.zeros((500, 40)); within the loop, x[i, :] = res.x. A combined sketch follows below.
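A minimal sketch combining the preallocation with the warm start (my own combination, reusing the names from the snippet above):

x = np.zeros((500, 40))  # preallocate the result array
x0 = np.zeros(40)
for i in range(a.shape[0]):
    res = minimize(cost, x0, args=(a[None, i], b[None, i]))
    x[i, :] = res.x  # write directly into the preallocated array
    x0 = res.x       # warm start for the next problem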
Upvotes: 2