Reputation: 323
I am trying to use scipy.optimize.minimize
with simple a <= x <= b
bounds. However, it often happens that my target function is evaluated just outside the bounds. To my understanding, this happens when minimize
tries to determine the gradient of the target function at the boundary.
Minimal example:
import math
import numpy as np
from scipy.optimize import Bounds, minimize
constraint = Bounds([-1, -1], [1, 1], True)
def fun(x):
print(x)
return -math.exp(-np.dot(x,x))
result = minimize(fun, [-1, -1], bounds=constraint)
The output shows that the minimizer jumps to the point [1,1]
and then tries to evaluate at [1.00000001, 1]
:
[-1. -1.]
[-0.99999999 -1. ]
[-1. -0.99999999]
[-0.72932943 -0.72932943]
[-0.72932942 -0.72932943]
[-0.72932943 -0.72932942]
[-0.22590689 -0.22590689]
[-0.22590688 -0.22590689]
[-0.22590689 -0.22590688]
[1. 1.]
[1.00000001 1. ]
[1. 1.00000001]
[-0.03437328 -0.03437328]
...
Of course, there is no problem in this example, as fun
can be evaluated also there. But that might not always be the case...
In my actual problem, the minimum can not be on the boundary and I have the easy workaround of adding an epsilon to the bounds. But one would expect that there should be an easy solution to this issue which also works if the minimum can be at a boundary?
PS: It would be strange if I were the first to have this problem -- sorry if this question has been asked before somewhere, but I didn't find it anywhere.
Upvotes: 9
Views: 7215
Reputation: 323
As discussed here (thanks @"Welcome to Stack Overflow" for the comment directing me there), the problem is indeed that the gradient routine doesn't respect the bounds. I wrote a new one that does the job:
import math
import numpy as np
from scipy.optimize import minimize
def gradient_respecting_bounds(bounds, fun, eps=1e-8):
"""bounds: list of tuples (lower, upper)"""
def gradient(x):
fx = fun(x)
grad = np.zeros(len(x))
for k in range(len(x)):
d = np.zeros(len(x))
d[k] = eps if x[k] + eps <= bounds[k][1] else -eps
grad[k] = (fun(x + d) - fx) / d[k]
return grad
return gradient
bounds = ((-1, 1), (-1, 1))
def fun(x):
print(x)
return -math.exp(-np.dot(x,x))
result = minimize(fun, [-1, -1], bounds=bounds,
jac=gradient_respecting_bounds(bounds, fun))
Note that this can be a bit less efficient, because fun(x)
now gets evaluated twice at each point.
This seems to be unavoidable, relevant snippet from _minimize_lbfgsb
in lbfgsb.py:
if jac is None:
def func_and_grad(x):
f = fun(x, *args)
g = _approx_fprime_helper(x, fun, epsilon, args=args, f0=f)
return f, g
else:
def func_and_grad(x):
f = fun(x, *args)
g = jac(x, *args)
return f, g
As you can see, the value of f
can only be reused by the internal _approx_fprime_helper
function.
Upvotes: 7