Reputation: 303
I am using scipy.optimize's least_squares method to perform a constrained non-linear least squares optimization. How would I go about getting the covariance matrix of the fitted parameters, so that I can put error bars on the fit parameters?
This seems to be pretty clear for curve_fit and leastsq, but not so clear (at least to me) for the least_squares method.
One approach I have tried: since least_squares returns the Jacobian matrix J (the "jac" return value), I approximate the Hessian as H = 2 J^T J. The covariance matrix is then H^{-1}, i.e. approximately (2 J^T J)^{-1}. However, I'm worried this may be too coarse an approximation for the covariance.
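Concretely, what I am doing looks roughly like this (a minimal sketch; the residuals function, model, data and bounds below are made up purely to show the shape of the computation):

import numpy as np
from scipy.optimize import least_squares

# made-up model and data, only to illustrate the computation
def residuals(p, x, y):
    return p[0] * np.exp(p[1] * x) - y

x = np.linspace(0, 1, 50)
y = 2.0 * np.exp(1.5 * x) + np.random.normal(0, 0.1, x.size)

res = least_squares(residuals, x0=[1.0, 1.0], args=(x, y), bounds=([0, 0], [10, 10]))
J = res.jac                   # Jacobian returned by least_squares at the solution
H = 2 * J.T @ J               # my Hessian approximation
cov = np.linalg.inv(H)        # candidate covariance matrix
perr = np.sqrt(np.diag(cov))  # candidate error bars on the parameters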
Upvotes: 6
Views: 8140
Reputation: 633
Actually, with optimize.least_squares I recover the same errors as both optimize.leastsq and optimize.curve_fit by using

hess_inv = (J^T J)^{-1}

They explain this approximation in: Why is the approximation of Hessian=JT J reasonable?

On the other hand, I recover the same errors from optimize.minimize (minimizing the sum of squared residuals) when using the expression you mentioned:

hess_inv = (2 J^T J)^{-1}

Using the example below, and calculating the errors of the fit as the square root of the diagonal elements of the covariance matrix, which is given by hess_inv * residual variance (as in this post):
import numpy as np
from scipy import optimize

np.random.seed(0)

# Model
def f(x, p):
    return p[0]*x + 0.4*np.sin(p[1]*x) + p[2]

# True parameters
p = np.array([1, 40, 2])
print('True p: ', p)

# Generate random data
xdata = np.linspace(0., 1, 120)
ydata = f(xdata, p) + np.random.normal(0., 0.2, len(xdata))

# Fits
pstart = [1, 42, 1]
errFunc = lambda p, x, y: f(x, p) - y

# Error of parameters: square root of the diagonal of hess_inv * residual variance
def errFit(hess_inv, resVariance):
    return np.sqrt(np.diag(hess_inv * resVariance))

# optimize.leastsq (cov_x is an estimate of (J^T J)^{-1}; multiply by the residual variance)
fit, hess_inv, infodict, errmsg, success = optimize.leastsq(errFunc, pstart, args=(xdata, ydata), full_output=1)
dFit = errFit(hess_inv, (errFunc(fit, xdata, ydata)**2).sum()/(len(ydata) - len(pstart)))
print('leastsq:\n\tp: ', fit, '\n\tdp: ', dFit)

# optimize.curve_fit (the covariance matrix is returned directly)
fit, cov = optimize.curve_fit(lambda x, p0, p1, p2: f(x, [p0, p1, p2]), xdata, ydata, p0=pstart)
dFit = np.sqrt(np.diag(cov))
print('curve_fit:\n\tp: ', fit, '\n\tdp: ', dFit)

# optimize.minimize (BFGS returns an approximate inverse Hessian of the cost sum(r**2))
result = optimize.minimize(lambda p, x, y: np.sum(errFunc(p, x, y)**2), pstart, args=(xdata, ydata))
dFit = errFit(result.hess_inv, result.fun/(len(ydata) - len(pstart)))
print('minimize:\n\tp: ', result.x, '\n\tdp: ', dFit)

# optimize.least_squares (build the covariance from the returned Jacobian)
result = optimize.least_squares(errFunc, pstart, args=(xdata, ydata))
resVar = (errFunc(result.x, xdata, ydata)**2).sum()/(len(ydata) - len(pstart))
dFit = errFit(np.linalg.inv(np.dot(result.jac.T, result.jac)), resVar)
dFit2 = errFit(np.linalg.inv(2 * np.dot(result.jac.T, result.jac)), resVar)
print('least_squares:\n\tp: ', result.x, '\n\tdp: ', dFit, '\n\tdp2: ', dFit2)
Output:
True p: [ 1 40 2]
leastsq:
    p: [ 1.0309677 40.11550517 2.01042538]
    dp: [0.06615346 0.11975949 0.03823861]
curve_fit:
    p: [ 1.0309677 40.11550517 2.01042538]
    dp: [0.06615346 0.11975949 0.03823861]
minimize:
    p: [ 1.03096783 40.1155004 2.01042534]
    dp: [0.04706757 0.08684545 0.02719714]
least_squares:
    p: [ 1.03096771 40.11550517 2.01042538]
    dp: [0.06615349 0.11975979 0.03823862]
    dp2: [0.04677758 0.08468296 0.02703878]
Upvotes: 2
Reputation: 26030
curve_fit itself uses a pseudoinverse of the Jacobian obtained from least_squares: https://github.com/scipy/scipy/blob/2526df72e5d4ca8bad6e2f4b3cbdfbc33e805865/scipy/optimize/minpack.py#L739
One thing to note is that this whole approach is dubious if the result is close to the bounds.
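For reference, a minimal sketch of that pseudoinverse construction applied to a least_squares result (the covariance_from_jac helper and variable names are illustrative, not the exact scipy source): build a pseudoinverse of J^T J from the SVD of the Jacobian, then scale it by the residual variance.

import numpy as np
from scipy.linalg import svd

def covariance_from_jac(res, n_data):
    # Covariance via a pseudoinverse of the Jacobian, roughly as curve_fit does it.
    _, s, VT = svd(res.jac, full_matrices=False)
    # discard tiny singular values so the pseudoinverse stays well behaved
    threshold = np.finfo(float).eps * max(res.jac.shape) * s[0]
    s = s[s > threshold]
    VT = VT[:s.size]
    pcov = np.dot(VT.T / s**2, VT)                  # behaves like inv(J^T J)
    # res.cost is 0.5 * sum(residuals**2), so 2 * res.cost is the residual sum of squares
    res_variance = 2 * res.cost / (n_data - res.x.size)
    return pcov * res_variance

# usage with the least_squares result from the example above:
# res = optimize.least_squares(errFunc, pstart, args=(xdata, ydata))
# perr = np.sqrt(np.diag(covariance_from_jac(res, len(ydata))))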
Upvotes: 2