user19346

Reputation: 303

Getting covariance matrix of fitted parameters from scipy optimize.least_squares method

I am using scipy.optimize's least_squares method to perform a constrained non-linear least squares optimization. How would I go about getting the covariance matrix of the fitted parameters, so that I can put error bars on the fit parameters?

This seems to be pretty clear for curve_fit and leastsq, but not so clear (at least to me) for the least_squares method.

One approach I have been using: since least_squares returns the Jacobian matrix J (the "jac" attribute of the result), I approximate the Hessian as H = 2 J^T J. The covariance matrix is then H^{-1}, i.e. approximately (2 J^T J)^{-1}. I'm worried, though, that this may be too coarse an approximation for the covariance.
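Concretely, what I have been doing looks roughly like this (the toy model and names here are just for illustration, not my actual problem):

import numpy as np
from scipy.optimize import least_squares

# toy model just for illustration: y = a*exp(b*x), with bounds on both parameters
def residuals(p, x, y):
    return p[0] * np.exp(p[1] * x) - y

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * np.exp(-1.3 * x) + rng.normal(0, 0.01, x.size)

res = least_squares(residuals, [1.0, -1.0], bounds=([0, -5], [10, 0]), args=(x, y))
J = res.jac                    # Jacobian of the residual vector at the solution
H = 2 * J.T @ J                # my Hessian approximation
cov = np.linalg.inv(H)         # covariance estimate I'm unsure about
perr = np.sqrt(np.diag(cov))   # error bars on the fitted parameters
print(res.x, perr)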

Upvotes: 6

Views: 8140

Answers (2)

Puco4

Reputation: 633

Actually, with optimize.least_squares I recover the same errors as from both optimize.leastsq and optimize.curve_fit, using:

hess_inv = (J^T J)^{-1}

This approximation is explained in: Why is the approximation of Hessian = J^T J reasonable?

On the other hand, the errors I get from optimize.minimize (minimizing the sum of squared residuals directly) match the expression you mentioned:

hess_inv = (2 J^T J)^{-1}
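
As a quick sketch of why both expressions appear (assuming small residuals near the solution, as discussed in the linked post), write the objective as S(p) = \sum_i r_i(p)^2 with residual Jacobian J:

\nabla S   = 2 J^T r
\nabla^2 S = 2 \left( J^T J + \sum_i r_i \nabla^2 r_i \right) \approx 2 J^T J

So optimize.minimize's hess_inv corresponds to (2 J^T J)^{-1}, while the usual covariance estimate of the fitted parameters is the residual variance times (J^T J)^{-1}, which is what the code below uses.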


Code

Using this example, and calculating the errors of the fit as the square root of the diagonal elements of the covariance matrix, which is given by hess_inv * residual variance (as in this post):

import numpy as np
from scipy import optimize

np.random.seed(0)

#Model function and true parameters
def f(x, p):
    return p[0]*x + 0.4*np.sin(p[1]*x) + p[2]
p = np.array([1, 40, 2])

print('True p: ', p)

#Generate random data
xdata = np.linspace(0., 1, 120)
ydata = f(xdata, p) +  np.random.normal(0., 0.2, len(xdata))

#Fits
pstart = [1,42,1]
errFunc = lambda p, x, y: f(x,p) - y

#Error of parameters
def errFit(hess_inv, resVariance):
    return np.sqrt( np.diag( hess_inv * resVariance))

#optimize.leastsq
fit, hess_inv, infodict, errmsg, success = optimize.leastsq(errFunc, pstart, args=(xdata, ydata), full_output=1)
dFit = errFit(hess_inv, (errFunc(fit, xdata, ydata)**2).sum()/(len(ydata)-len(pstart) ) )
print('leastsq:\n\tp: ' , fit, '\n\tdp: ', dFit)

#optimize.curve_fit
fit, cov = optimize.curve_fit(lambda x, p0, p1, p2: f(x, [p0, p1, p2]), xdata, ydata, p0=pstart)
dFit = np.sqrt(np.diag(cov))
print('curve_fit:\n\tp: ' , fit, '\n\tdp: ', dFit)

#optimize.minimize
result = optimize.minimize(lambda p,x,y: np.sum( errFunc(p,x,y)**2 ), pstart,  args=(xdata, ydata))
dFit = errFit( result.hess_inv,  result.fun/(len(ydata)-len(pstart)) )
print('minimize:\n\tp: ' , result.x, '\n\tdp: ', dFit)

#optimize.least_squares
result = optimize.least_squares(errFunc, pstart, args=(xdata, ydata))
dFit = errFit( np.linalg.inv(np.dot(result.jac.T, result.jac)), (errFunc(result.x, xdata, ydata)**2).sum()/(len(ydata)-len(pstart) ) ) 
dFit2 = errFit( np.linalg.inv(2 * np.dot(result.jac.T, result.jac)), (errFunc(result.x, xdata, ydata)**2).sum()/(len(ydata)-len(pstart) ) ) 
print('least_squares:\n\tp: ' , result.x, '\n\tdp: ', dFit, '\n\tdp2: ', dFit2)

Output:

True p:  [ 1 40  2]
leastsq:
    p:  [ 1.0309677  40.11550517  2.01042538] 
    dp:  [0.06615346 0.11975949 0.03823861]
curve_fit:
    p:  [ 1.0309677  40.11550517  2.01042538] 
    dp:  [0.06615346 0.11975949 0.03823861]
minimize:
    p:  [ 1.03096783 40.1155004   2.01042534] 
    dp:  [0.04706757 0.08684545 0.02719714]
least_squares:
    p:  [ 1.03096771 40.11550517  2.01042538] 
    dp:  [0.06615349 0.11975979 0.03823862] 
    dp2:  [0.04677758 0.08468296 0.02703878]

Upvotes: 2

ev-br

Reputation: 26030

curve_fit itself uses a pseudoinverse of the Jacobian returned by least_squares: https://github.com/scipy/scipy/blob/2526df72e5d4ca8bad6e2f4b3cbdfbc33e805865/scipy/optimize/minpack.py#L739
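
A minimal sketch of that SVD/pseudoinverse computation (the toy fit and variable names here are just for illustration, not from the question):

import numpy as np
from scipy.optimize import least_squares

# toy linear model, just to have a least_squares result to work with
def residuals(p, x, y):
    return p[0] * x + p[1] - y

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 100)
y = 3.0 * x + 1.0 + rng.normal(0, 0.1, x.size)

res = least_squares(residuals, [1.0, 0.0], args=(x, y))

# Moore-Penrose pseudoinverse of the Jacobian via SVD, discarding tiny
# singular values, which is essentially what curve_fit does in the linked source
_, s, VT = np.linalg.svd(res.jac, full_matrices=False)
threshold = np.finfo(float).eps * max(res.jac.shape) * s[0]
s = s[s > threshold]
VT = VT[:s.size]
pcov = (VT.T / s**2) @ VT                      # (J^T J)^{-1}

# scale by the residual variance (2*res.cost is the sum of squared residuals)
pcov *= 2 * res.cost / (y.size - res.x.size)
perr = np.sqrt(np.diag(pcov))                  # standard errors of the fitted parameters
print(res.x, perr)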

One thing to note is that this whole approach is dubious if the result is close to the bounds.

Upvotes: 2
