Dole

Reputation: 339

Scipy won't find the optimum (simple cosine function)

I am trying to estimate the argument of a cosine function using the SciPy optimizer (yes, I am aware arccos could be used, but I don't want to do that).

The code + a demonstration:

import numpy
import scipy.optimize

def solver(data):
    Z = numpy.zeros(len(data))
    a = 0.003  # starting point for the optimizer
    for i in range(len(data)):
        def minimizer(b):
            return numpy.abs(data[i] - numpy.cos(b))
        Z[i] = scipy.optimize.minimize(minimizer, a,
                                       bounds=[(0, numpy.pi)],
                                       method="L-BFGS-B").x[0]
    return Z

Y = numpy.zeros(100)
for i in range(100):
    Y[i] = numpy.cos(i/25)

solver(Y)

The result is not good: once the argument of the cosine function reaches values above 2, the estimate "skips over" the true value and returns the upper bound (pi) instead.

array([0.        , 0.04      , 0.08      , 0.12      , 0.16      ,
       0.2       , 0.24      , 0.28      , 0.32      , 0.36      ,
       0.4       , 0.44      , 0.48      , 0.52      , 0.56      ,
       0.6       , 0.64      , 0.67999999, 0.72      , 0.75999999,
       0.8       , 0.83999999, 0.88      , 0.92      , 0.95999999,
       1.        , 1.04      , 1.08      , 1.12      , 1.16      ,
       1.2       , 1.24      , 1.28      , 1.32      , 1.36      ,
       1.4       , 1.44      , 1.48      , 1.52      , 1.56      ,
       1.6       , 1.64      , 1.68      , 1.72      , 1.76      ,
       1.8       , 1.84      , 1.88      , 1.91999999, 1.95999999,
       2.        , 2.04      , 3.14159265, 3.14159265, 3.14159265,
       3.14159265, 3.14159265, 3.14159265, 3.14159265, 3.14159265,...

What causes this phenomenon? Are there other optimizers or settings that could help with this issue?

Upvotes: 3

Views: 2093

Answers (2)

user6655984


Besides the general minimize method, SciPy has minimize_scalar specifically for one-dimensional problems like this one, and least_squares for minimizing functions that measure the difference between two quantities (such as the difference between cos(b) and data[i] here). The latter performs well here, even without fine-tuning.

for i in range(len(data)):
    Z[i] = scipy.optimize.least_squares(lambda b: data[i] - numpy.cos(b), a, bounds=(0, numpy.pi)).x[0]

The function passed to least_squares is the thing we'd like to be close to 0, without an absolute value on it. I'll add that a=0.003 seems a suboptimal choice for a starting point, being so close to the boundary; nonetheless it works.
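For completeness, a sketch of what the minimize_scalar route could look like (method='bounded' is one reasonable choice here; on [0, pi] the objective is V-shaped, which a bounded scalar minimizer handles despite the kink of the absolute value):

for i in range(len(data)):
    # 'bounded' needs no starting point, only the interval
    Z[i] = scipy.optimize.minimize_scalar(
        lambda b: numpy.abs(data[i] - numpy.cos(b)),
        bounds=(0, numpy.pi), method='bounded').x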

Also, as a_guest already posted, a scalar root finding method should do the same thing while throwing fewer surprises here, given that we already have a nice bracketing interval [0, pi]. Bisection is reliable but slow; Brent's method is what I'd probably use.

for i in range(len(data)):
    Z[i] = scipy.optimize.brentq(lambda b: data[i] - numpy.cos(b), 0, numpy.pi)

Upvotes: 3

a_guest

Reputation: 36299

The reason is that for the function (for example) f = abs(cos(0.75*pi) - cos(z)) the gradient f' happens to vanish at z = pi (the derivative of cos is -sin, and sin(pi) = 0), as can be seen from the following plot:

[Plot of f(z) = abs(cos(0.75*pi) - cos(z)) on [0, pi]; the curve flattens out, with zero slope, at z = pi]
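A quick finite-difference check illustrates this (a sketch; the step size 1e-6 is an arbitrary small value):

import numpy as np

f = lambda z: np.abs(np.cos(0.75*np.pi) - np.cos(z))
eps = 1e-6
# Backward difference at z = pi; prints a value on the order of 1e-6,
# i.e. the slope of f vanishes as z approaches pi
print((f(np.pi) - f(np.pi - eps)) / eps)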

If you check the result of the optimization procedure, you'll see that:

      fun: array([0.29289322])
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>
      jac: array([0.])
  message: b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     nfev: 16
      nit: 2
   status: 0
  success: True
        x: array([3.14159265])

So the optimization procedure reached one of its convergence criteria. More detailed information about the criterion can be found in the L-BFGS-B documentation. It says that

gtol : float

The iteration will stop when max{|proj g_i | i = 1, ..., n} <= gtol where pg_i is the i-th component of the projected gradient.

So it eventually steps to a point z >= pi, which the bound constraint projects back to z = pi, and at this point the gradient of the function is zero, hence it stops. You can observe that by registering a callback which prints the current parameter vector:

def new_callback():
    step = 1

    def callback(xk):
        # Invoked by minimize() after each iteration with the current x
        nonlocal step
        print('Step #{}: xk = {}'.format(step, xk))
        step += 1

    return callback

scipy.optimize.minimize(..., callback=new_callback())

Which outputs:

Step #1: xk = [0.006]
Step #2: xk = [3.14159265]

So at the second step it hit z >= pi, which was projected back to z = pi.

You can circumvent this problem by reducing the bounds, for example to bounds=[(0, 0.99*np.pi)]. This will give you the expected result; however, the method won't converge cleanly, and you will see something like:

      fun: array([1.32930966e-09])
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>
      jac: array([0.44124484])
  message: b'ABNORMAL_TERMINATION_IN_LNSRCH'
     nfev: 160
      nit: 6
   status: 2
  success: False
        x: array([2.35619449])

Note the message ABNORMAL_TERMINATION_IN_LNSRCH. This is due to the nature of abs(x), whose derivative has a discontinuity at x = 0 (you can read more about that here).
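A smooth workaround along the same lines (a sketch, not part of the original code): square the residual instead of taking its absolute value. The minimizer location is unchanged, but the objective becomes differentiable at zero, so the line search should no longer fail:

# Inside the loop from the question, reusing data[i] and a:
def minimizer_sq(b):
    # same minimum as abs(data[i] - cos(b)), but smooth at 0
    return (data[i] - numpy.cos(b))**2

Z[i] = scipy.optimize.minimize(minimizer_sq, a,
                               bounds=[(0, 0.99*numpy.pi)],
                               method="L-BFGS-B").x[0]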

Alternative approach (finding the root)

In all of the above, we were trying to find a value z for which cos(z) == cos(0.75*pi) (or abs(cos(z) - cos(0.75*pi)) < eps). This problem is really that of finding the root of the function f = cos(z) - cos(0.75*pi), where we can make use of the fact that cos is a continuous function. We need to set the boundaries a, b such that f(a)*f(b) < 0 (i.e. the function values have opposite signs). For example, using the bisect method:

f = lambda z: np.cos(z) - np.cos(0.75*np.pi)
res = scipy.optimize.bisect(f, 0, np.pi)

Upvotes: 3
