peter.petrov

Reputation: 39477

curve_fit - question about var/covar matrix

I am using curve_fit to fit a curve to a set of data points (x, y) in 2D. As we know, curve_fit has a p0 parameter (the initial guess for the parameters).

https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html

The second thing returned by curve_fit is pcov; when I take the square root of its diagonal, I get a vector v of values.

Then I sum the values in v and get a number S, which I interpret (correctly or not, I am not sure) as an overall standard deviation (or, say, a sum of standard deviations).
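For concreteness, here is a minimal sketch of what I am computing (the model function and the data below are made-up placeholders, not my real code):

```python
import numpy as np
from scipy.optimize import curve_fit

# Placeholder model and data, just to illustrate the computation.
def model(x, a, b):
    return a * np.exp(b * x)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = model(x, 2.0, 1.5) + 0.1 * rng.normal(size=x.size)

popt, pcov = curve_fit(model, x, y, p0=[1.0, 1.0])

v = np.sqrt(np.diag(pcov))  # one standard deviation per fitted parameter
S = v.sum()                 # the number S I am asking about
```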

I notice that when I vary p0 I get different curves with different S values. But sometimes the curves look visually quite similar while their S values differ a lot.

I don't fully understand this pcov matrix, hence my confusion. It is the variance-covariance matrix of what, exactly?

My question is this: this S value

1) is it a measure of how well my curve fits to the data?

or

2) is it more like a measure of how fast the optimization process (which happens inside curve_fit) converges (given the particular p0 value which I used)?

I hope it is 1) and thus I can use this number S as a quality measure for the curve fitting process.

Is that so or not?

Also, any explanations related to the above-mentioned doubts would be much appreciated.

Upvotes: 1

Views: 1699

Answers (2)

Raf

Reputation: 1757

In my understanding it is more like your 1st point.

From the docs:

The estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).

So the smaller the entries of pcov, the smaller the errors on the parameter estimates.

-- edit ---

pcov indicates how certain the solver is that the provided parameters, popt, are optimal and unique. This doesn't necessarily mean that they provide a good fit to the data.
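A rough sketch of that distinction (my own construction, not from the question): with absolute_sigma=True, pcov comes purely from the curvature of the least-squares problem and is not rescaled by the residuals, so a deliberately wrong model can yield tiny parameter errors together with a terrible fit.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.linspace(0, 10, 1000)
y = x**2                       # data generated by a quadratic

def line(x, a, b):             # deliberately wrong (linear) model
    return a * x + b

popt, pcov = curve_fit(line, x, y, sigma=np.ones_like(x), absolute_sigma=True)

perr = np.sqrt(np.diag(pcov))
residuals = y - line(x, *popt)
print(perr)                    # tiny parameter errors...
print(np.sum(residuals**2))    # ...yet a huge sum of squared residuals
```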

Upvotes: 0

mikuszefski

Reputation: 4043

Actually, neither 1) nor 2). The covariance matrix gives the errors and correlations of the fit parameters. You can have large errors while the fit "looks good": large errors just mean that the predictions of the model are not very precise. When two parameters are basically doing the same thing, their correlation (the off-diagonal elements) can be very large while the model is still good.

The goodness of a fit is better checked, as a first approach, by a reduced chi-square. Most fitting is done by minimizing chi-square. The speed of convergence depends on many things, including the complexity of the high-dimensional chi-square hyper-surface (one dimension per parameter), while the errors are only estimated from the curvature of an approximately parabolic (local) minimum.

When going into detail it becomes complicated very fast, but as a rough idea, that's it, somehow.
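As an illustration of the reduced chi-square check, here is a generic sketch with invented data (not the OP's problem):

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a * np.exp(-b * x)

# Synthetic data with a known noise level, just for illustration.
rng = np.random.default_rng(0)
x = np.linspace(0, 5, 40)
sigma = 0.05 * np.ones_like(x)
y = model(x, 1.0, 0.7) + sigma * rng.normal(size=x.size)

popt, pcov = curve_fit(model, x, y, sigma=sigma, absolute_sigma=True)

# Reduced chi-square: chi^2 divided by the degrees of freedom.
residuals = (y - model(x, *popt)) / sigma
chi2_red = np.sum(residuals**2) / (len(x) - len(popt))
print(chi2_red)  # close to 1 for a good fit with correctly estimated sigma
```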

Addendum

The following question came up: is it that a small S usually indicates a good fit, while a good-looking fit does not necessarily have a small S?

In the fitting process, the best-fit parameters are eventually a function of the input data. The error of a parameter reflects how that parameter would change if the input data changed slightly. Say we have a + b * x_i = y_i (this is linear, but one can generalize it; we assume no error in x). Then a = f(x, y), and the error s_a is related to the derivatives d f / d y_i. It is, in fact, error propagation: how do errors in the y_i affect a, and so on.

So most of the time I'd say that a small error means a good fit; in a linear system, for sure. In strange non-linear cases I am quite certain one can construct a situation (a steep local minimum) where the variation of the parameters is small with respect to variations in the input values, while the fit itself is quite bad. In that case one would have small errors and a bad fit at the same time. One would probably see it in the chi-square value, though.
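A small sketch of that propagation in the linear case (the numbers are made up): the least-squares solution is linear in y, theta_hat = (X^T X)^(-1) X^T y, so that matrix is exactly d(theta_hat)/dy, and the parameter covariance follows by propagating the errors of the y_i through it.

```python
import numpy as np

# Linear model y = a + b*x written as X @ theta with columns [1, x].
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 20)
X = np.column_stack([np.ones_like(x), x])
sigma_y = 0.1
y = 1.0 + 2.0 * x + sigma_y * rng.normal(size=x.size)

M = np.linalg.inv(X.T @ X) @ X.T     # M = d(theta_hat)/dy, shape (2, n)

# Error propagation: cov(theta) = M @ diag(sigma_y**2) @ M.T
cov_theta = sigma_y**2 * (M @ M.T)
print(np.sqrt(np.diag(cov_theta)))   # standard deviations s_a and s_b
```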

On the other hand, and as mentioned before, you could have a good-looking fit with a very small chi-square but large errors. This is not a problem per se. But if you use the model with the fitted parameters to predict other values, you need to perform error propagation. As a consequence, the prediction might be very good in reality, but the math tells you to announce low confidence.
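For completeness, a minimal sketch of that prediction step (model and numbers invented for illustration): propagate pcov to a new point x0 via the gradient of the model with respect to the parameters.

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b):
    return a + b * x

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 30)
y = model(x, 1.0, 2.0) + 0.5 * rng.normal(size=x.size)

popt, pcov = curve_fit(model, x, y)

# Predict at a new point and propagate the parameter covariance.
# For this model the gradient w.r.t. (a, b) at x0 is (1, x0).
x0 = 12.0
grad = np.array([1.0, x0])
y_pred = model(x0, *popt)
y_err = np.sqrt(grad @ pcov @ grad)
print(y_pred, "+/-", y_err)
```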

Some math is given here

Upvotes: 1
