user2261062

scipy minimize not finding solution

I'm trying to solve a set of equations using scipy.minimize, but I'm not getting satisfactory results, so maybe I'm getting something wrong. I want to solve the following system of equations.

12.25 * (x + y * 2.2 + z * 4.84) - 8.17437483750257 = 0
12.25 * (x + y * 3.1 + z * 9.61) - 21.9317236606432 = 0
12.25 * (x + y * 4   + z * 16)   - 107.574834524443 = 0

Using Wolfram Alpha I get the answers

x=22.626570068753, y=-17.950683342597, z=3.6223614029055

Which indeed solve the system of equations, giving a residual error of

9.407585821463726e-12

Now using scipy.minimize I do:

import numpy as np
from scipy.optimize import minimize

def my_func(p):
    # Sum of the absolute residuals of the three equations
    points = [8.17437483750257, 21.9317236606432, 107.574834524443]
    h1 = abs(12.25 * (p[0] + p[1] * 2.2 + p[2] * 4.84) - points[0])
    h2 = abs(12.25 * (p[0] + p[1] * 3.1 + p[2] * 9.61) - points[1])
    h3 = abs(12.25 * (p[0] + p[1] * 4   + p[2] * 16)   - points[2])
    return h1 + h2 + h3

ini = np.array([22, -15, 5])   # Initial guess close to the solution
res = minimize(my_func, ini)
print(res)



      fun: 1.4196640741924451
 hess_inv: array([[ 20.79329103, -14.63447889,   2.36145776],
       [-14.63447889,  10.30037625,  -1.66214485],
       [  2.36145776,  -1.66214485,   0.26822135]])
      jac: array([ 12.25      ,  60.02499545, 254.43249989])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 261
      nit: 8
     njev: 64
   status: 2
  success: False
        x: array([ 21.39197235, -17.08623345,   3.48344393])

First, it says success=False, and second, it stops at a point that is not optimal.

Why does it fail to find the solution even when the initial values are close to the optimum?

Is something wrong with the way I defined the optimizer?

I also tried running it with initial values of [0, 0, 0], and it gives even worse results:

ini = np.array([0, 0, 0])   # Initial guess at the origin
res = minimize(my_func, ini)
print(res)

      fun: 73.66496363902732
 hess_inv: array([[ 0.98461683, -0.04223651, -0.1207056 ],
       [-0.04223651,  0.88596592, -0.31885642],
       [-0.1207056 , -0.31885642,  0.13448927]])
      jac: array([ 12.25      ,  15.92499924, -18.98750019])
  message: 'Desired error not necessarily achieved due to precision loss.'
     nfev: 164
      nit: 1
     njev: 40
   status: 2
  success: False
        x: array([0.02901304, 0.08994042, 0.29448233])

Note: I don't want to use fsolve to find the solutions, but minimize. The reason is that my real problem involves more equations than unknowns, so in the end I want a solution that minimizes the error across all of these equations. Since it was not giving good results, I wanted to first test it on an easy problem for which an exact solution exists. But even in this case it doesn't work. Once I make it work for this problem, I'll expand it by adding more equations.

Upvotes: 0

Views: 1145

Answers (2)

nonDucor

Reputation: 2093

Optimisation problems (and solvers) usually benefit from a well-behaved (smooth) "optimisation surface". When you use the abs function, it creates a "choppy" surface, with points where the derivatives are not continuous.

If instead of using abs you use a quadratic function (that has the same effect), you get a solution close to what you expect. Just change my_func to:

def my_func(p):
    points = [8.17437483750257, 21.9317236606432, 107.574834524443]
    # Squared residuals: smooth and differentiable everywhere, unlike abs
    h1 = (12.25 * (p[0] + p[1] * 2.2 + p[2] * 4.84) - points[0])**2
    h2 = (12.25 * (p[0] + p[1] * 3.1 + p[2] * 9.61) - points[1])**2
    h3 = (12.25 * (p[0] + p[1] * 4   + p[2] * 16)   - points[2])**2
    return h1 + h2 + h3

What I got is:

      fun: 8.437863292878727e-10
 hess_inv: array([[ 0.64753863, -0.43474506,  0.06909179],
       [-0.43474506,  0.29487798, -0.04722923],
       [ 0.06909179, -0.04722923,  0.00761762]])
      jac: array([-6.26698693e-12,  6.22490927e-10, -5.11716516e-10])
  message: 'Optimization terminated successfully.'
     nfev: 55
      nit: 7
     njev: 11
   status: 0
  success: True
        x: array([ 22.62653789, -17.95066124,   3.62235782])

Upvotes: 1

ForceBru

Reputation: 44906

...my real problem involves more equations than unknowns, so in the end I want a solution that minimizes the error across all of these equations

This sounds a lot like the problem solved in the generalized method of moments (GMM), where you also have more equations than unknowns.

Problems of this kind are usually solved using least squares. Suppose your whole system looks like this:

h1(x, y, z) = 0
h2(x, y, z) = 0
h3(x, y, z) = 0
h4(x, y, z) = 0

It has 3 unknowns and 4 equations. Then your objective function will be:

F(x, y, z) = H(x, y, z)' * W * H(x, y, z)
  • H(x, y, z) is the vector of all hj(x, y, z) above
  • H(x, y, z)' is its transpose
  • W is the weighting matrix

If W is the identity matrix, you get the least squares objective function. Then, F(x, y, z) is a quadratic form (basically a parabola in multiple dimensions), which should be simple to optimize because it's convex and smooth.
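
For instance, here is a minimal sketch of this objective for the three equations from the question, with W set to the identity; A and b are just the coefficients and constants of the system, and an overdetermined system would simply add rows to both:

import numpy as np
from scipy.optimize import minimize

# Coefficients and right-hand sides of the three equations from the question
A = 12.25 * np.array([
    [1.0, 2.2, 4.84],
    [1.0, 3.1, 9.61],
    [1.0, 4.0, 16.0],
])
b = np.array([8.17437483750257, 21.9317236606432, 107.574834524443])

W = np.eye(len(b))  # identity weighting matrix -> plain least squares

def H(p):
    # Vector of residuals h_j(p)
    return A @ p - b

def F(p):
    # Quadratic form H(p)' W H(p)
    h = H(p)
    return h @ W @ h

res = minimize(F, np.zeros(3))
print(res.x)  # should be close to [22.6266, -17.9507, 3.6224]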

Your code uses absolute values like h1 = abs(12.25 * (p[0] + p[1] * 2.2 + p[2] * 4.84) - points[0]), but abs is not differentiable at the origin, and that's exactly where your optimum lies, since you essentially want h1 to equal zero.

You can replace the absolute value by squaring the error instead:

h1 = (12.25 * (p[0] + p[1] * 2.2 + p[2] * 4.84) - points[0])**2

This results in basically the same approach as the GMM (or least squares) and gives you a function that's easy to optimize since the square is smooth near the origin.
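
As a side alternative, scipy also provides scipy.optimize.least_squares, which takes the vector of residuals directly and minimizes their sum of squares; a minimal sketch for the same three equations (it accepts any number of residuals, so the overdetermined case works the same way):

import numpy as np
from scipy.optimize import least_squares

points = [8.17437483750257, 21.9317236606432, 107.574834524443]

def residuals(p):
    # One entry per equation; least_squares minimizes the sum of their squares
    return [
        12.25 * (p[0] + p[1] * 2.2 + p[2] * 4.84) - points[0],
        12.25 * (p[0] + p[1] * 3.1 + p[2] * 9.61) - points[1],
        12.25 * (p[0] + p[1] * 4   + p[2] * 16)   - points[2],
    ]

res = least_squares(residuals, np.zeros(3))
print(res.x)  # should be close to [22.6266, -17.9507, 3.6224]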

Upvotes: 1
