Noiralef
Noiralef

Reputation: 323

Scipy: Minimize violates given bounds

I am trying to use scipy.optimize.minimize with simple a <= x <= b bounds. However, it often happens that my target function is evaluated just outside the bounds. To my understanding, this happens when minimize tries to determine the gradient of the target function at the boundary.


Minimal example:

import math
import numpy as np
from scipy.optimize import Bounds, minimize

constraint = Bounds([-1, -1], [1, 1], True)
def fun(x):
    print(x)
    return -math.exp(-np.dot(x,x))
result = minimize(fun, [-1, -1], bounds=constraint)

The output shows that the minimizer jumps to the point [1,1] and then tries to evaluate at [1.00000001, 1]:

[-1. -1.]
[-0.99999999 -1.        ]
[-1.         -0.99999999]
[-0.72932943 -0.72932943]
[-0.72932942 -0.72932943]
[-0.72932943 -0.72932942]
[-0.22590689 -0.22590689]
[-0.22590688 -0.22590689]
[-0.22590689 -0.22590688]
[1. 1.]
[1.00000001 1.        ]
[1.         1.00000001]
[-0.03437328 -0.03437328]
...

Of course, there is no problem in this example, as fun can be evaluated also there. But that might not always be the case...


In my actual problem, the minimum can not be on the boundary and I have the easy workaround of adding an epsilon to the bounds. But one would expect that there should be an easy solution to this issue which also works if the minimum can be at a boundary?

PS: It would be strange if I were the first to have this problem -- sorry if this question has been asked before somewhere, but I didn't find it anywhere.

Upvotes: 9

Views: 7215

Answers (1)

Noiralef
Noiralef

Reputation: 323

As discussed here (thanks @"Welcome to Stack Overflow" for the comment directing me there), the problem is indeed that the gradient routine doesn't respect the bounds. I wrote a new one that does the job:

import math
import numpy as np
from scipy.optimize import minimize

def gradient_respecting_bounds(bounds, fun, eps=1e-8):
    """bounds: list of tuples (lower, upper)"""
    def gradient(x):
        fx = fun(x)
        grad = np.zeros(len(x))
        for k in range(len(x)):
            d = np.zeros(len(x))
            d[k] = eps if x[k] + eps <= bounds[k][1] else -eps
            grad[k] = (fun(x + d) - fx) / d[k]
        return grad
    return gradient

bounds = ((-1, 1), (-1, 1))
def fun(x):
    print(x)
    return -math.exp(-np.dot(x,x))
result = minimize(fun, [-1, -1], bounds=bounds,
                  jac=gradient_respecting_bounds(bounds, fun))

Note that this can be a bit less efficient, because fun(x) now gets evaluated twice at each point. This seems to be unavoidable, relevant snippet from _minimize_lbfgsb in lbfgsb.py:

if jac is None:
    def func_and_grad(x):
        f = fun(x, *args)
        g = _approx_fprime_helper(x, fun, epsilon, args=args, f0=f)
        return f, g
else:
    def func_and_grad(x):
        f = fun(x, *args)
        g = jac(x, *args)
        return f, g

As you can see, the value of f can only be reused by the internal _approx_fprime_helper function.

Upvotes: 7

Related Questions