Reputation: 45
I get a puzzling result when using the 'L-BFGS-B' method in scipy.optimize.minimize:
import scipy.optimize as optimize
import numpy as np

def testFun():
    prec = 1e3
    func0 = lambda x: (float(x[0]*prec)/prec + 0.5)**2 + (float(x[1]*prec)/prec - 0.3)**2
    func1 = lambda x: (float(round(x[0]*prec))/prec + 0.5)**2 + (float(round(x[1]*prec))/prec - 0.3)**2

    result0 = optimize.minimize(func0, np.array([0, 0]), method='L-BFGS-B', bounds=((-1, 1), (-1, 1)))
    print(result0)
    print('func0 at [0,0]:', func0([0, 0]), '; func0 at [-0.5,0.3]:', func0([-0.5, 0.3]), '\n')

    result1 = optimize.minimize(func1, np.array([0, 0]), method='L-BFGS-B', bounds=((-1, 1), (-1, 1)))
    print(result1)
    print('func1 at [0,0]:', func1([0, 0]), '; func1 at [-0.5,0.3]:', func1([-0.5, 0.3]))

def main():
    testFun()

if __name__ == '__main__':
    main()
func0() and func1() are almost identical quadratic functions, differing only in that func1() rounds its input values to a precision of 0.001. The 'L-BFGS-B' method works well for func0(). However, just adding round() in func1() makes 'L-BFGS-B' stop searching for optimal values after the first step and return the initial value [0,0] as the optimal point.
This is not restricted to round(): replacing round() in func1() with int() results in the same behavior.
Does anyone know the reason for this?
Thanks a lot.
Upvotes: 3
Views: 7204
Reputation: 54330
The BFGS method is one of those methods that rely not only on the function value but also on the gradient, from which it builds an approximate Hessian (think of them as the first and second derivatives, if you wish). In your func1(), once you have round() in it, the gradient is no longer continuous. The BFGS method therefore fails right after the first iteration (think of it like this: BFGS searches around the starting parameter, finds that the gradient does not change, and stops). Similarly, I would expect other gradient-based methods to fail just as BFGS does.
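To see this concretely, here is a minimal check using scipy.optimize.approx_fprime (a sketch of what L-BFGS-B effectively does when you don't supply a gradient; the exact internal step may differ):

import numpy as np
from scipy.optimize import approx_fprime

prec = 1e3
func1 = lambda x: (round(x[0]*prec)/prec + 0.5)**2 + (round(x[1]*prec)/prec - 0.3)**2

eps = np.sqrt(np.finfo(float).eps)  # ~1.5e-8, a typical finite-difference step
print(approx_fprime(np.array([0.0, 0.0]), func1, eps))
# prints [0. 0.]: the step is far smaller than the 1e-3 rounding grid,
# so it never crosses a rounding boundary, the function looks locally
# constant, and the solver stops at the starting point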
You may be able to get it working by preconditioning or rescaling x. But better yet, try a gradient-free method such as 'Nelder-Mead' or 'Powell'.
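For example, a minimal sketch with Nelder-Mead (nothing here beyond the question's own func1; note the nonzero start, since the default initial simplex around [0,0] would itself be smaller than the 1e-3 rounding grid):

import numpy as np
from scipy.optimize import minimize

prec = 1e3
func1 = lambda x: (round(x[0]*prec)/prec + 0.5)**2 + (round(x[1]*prec)/prec - 0.3)**2

# start slightly away from [0, 0] so the initial simplex spans
# more than one 1e-3 rounding step
res = minimize(func1, np.array([0.1, 0.1]), method='Nelder-Mead')
print(res.x)  # should land near [-0.5, 0.3], up to the rounding grid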
Upvotes: 12
Reputation: 114781
round and int create step functions, which are not differentiable. The L-BFGS-B method is for solving smooth optimization problems. It uses an approximate gradient (if you don't give it an explicit one), and that will be garbage if the function has steps.
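A small illustration of that "garbage" (a sketch using scipy.optimize.approx_fprime; the solver's internal step size may differ): depending on whether the finite-difference step stays inside a flat piece or straddles a step edge, the approximate derivative is either exactly zero or enormous.

import numpy as np
from scipy.optimize import approx_fprime

prec = 1e3
func1 = lambda x: (round(x[0]*prec)/prec + 0.5)**2 + (round(x[1]*prec)/prec - 0.3)**2

eps = 1e-8
# step stays inside one flat piece: derivative is exactly zero
print(approx_fprime(np.array([0.0, 0.0]), func1, eps))
# step straddles a rounding boundary: the first component blows up (~1e5)
print(approx_fprime(np.array([0.0005 - 5e-9, 0.0]), func1, eps))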
Upvotes: 6