Reputation: 39477
I am using curve_fit to fit a curve to some set of data points (x, y) in 2D space. curve_fit has this p0 parameter, as we know.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html
The second thing returned by curve_fit is pcov. When I take the diagonal of pcov and square-root it, I get a vector v of values. Then I sum all the values in this vector v and get a number S: something which I interpret (correctly or not, I am not sure) as an overall standard deviation (or, say, a sum of standard deviations).
I am noticing that when I vary p0 I get different curves, and they have different S values. Also, sometimes the curves look visually not much different, but their S values differ a lot.
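Here is a minimal sketch of what I am doing (the exponential model and the synthetic data are just placeholders for my real setup):

```python
import numpy as np
from scipy.optimize import curve_fit

# Placeholder model and data, just to illustrate the procedure.
def model(x, a, b):
    return a * np.exp(b * x)

rng = np.random.default_rng(0)
x = np.linspace(0, 2, 50)
y = model(x, 2.0, 1.3) + rng.normal(0, 0.5, x.size)

for p0 in [(1.0, 1.0), (5.0, 0.1)]:
    popt, pcov = curve_fit(model, x, y, p0=p0, maxfev=10000)
    v = np.sqrt(np.diag(pcov))  # square root of the diagonal of pcov
    S = v.sum()                 # the number S I am asking about
    print(p0, "->", S)
```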
I don't fully understand this pcov matrix, hence my confusion. It is the variance-covariance matrix of what?
My question is this: this S value,

1) is it a measure of how well my curve fits the data?

or

2) is it more like a measure of how fast the optimization process (which happens inside curve_fit) converges, given the particular p0 value which I used?
I hope it is 1), so that I can use this number S as a quality measure for the curve-fitting process. Is that so or not?
Also, any explanations related to the above-mentioned doubts would be much appreciated.
Upvotes: 1
Views: 1699
Reputation: 1757
In my understanding it is more like your 1st point.
From the docs:
The estimated covariance of popt. The diagonals provide the variance of the parameter estimate. To compute one standard deviation errors on the parameters use perr = np.sqrt(np.diag(pcov)).
So the smaller pcov, the smaller the error on the parameter estimates.
--- edit ---
pcov indicates how certain the solver is that the provided parameters, popt, are optimal and unique. This doesn't necessarily mean that they provide a good fit to the data.
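A made-up illustration of that last point: fitting a straight line to clearly quadratic data yields parameter errors that are small compared to the parameter values, yet the fit is obviously poor:

```python
import numpy as np
from scipy.optimize import curve_fit

# Wrong model on purpose: a straight line through quadratic data.
x = np.linspace(0, 10, 200)
y = x**2

line = lambda x, a, b: a * x + b
popt, pcov = curve_fit(line, x, y)
perr = np.sqrt(np.diag(pcov))
resid = y - line(x, *popt)

print("popt:", popt)   # roughly [10.0, -16.7]
print("perr:", perr)   # small relative to popt ...
print("RMS residual:", np.sqrt(np.mean(resid**2)))  # ... but the fit is poor
```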
Upvotes: 0
Reputation: 4043
Actually, neither 1) nor 2). The covariance matrix gives the errors and correlations of the fit parameters. You can have large errors while the fit "looks good": large errors mean that the predictions of the model are not very precise. When two parameters are basically doing the same thing, their correlation (the off-diagonal elements) can be very large while the model is still good.

The goodness of a fit is better checked, in a first approach, by a reduced chi-square; most fitting is done by minimizing chi-square anyway. The speed of convergence depends on many things, including the complexity of the high-dimensional chi-square hyper-surface (one dimension per parameter), while the errors are only estimated from the curvature of an approximately parabolic (local) minimum.
When going into detail it becomes complicated very fast, but as a rough idea, that's it.
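As a sketch (the helper names are mine, not from scipy): the reduced chi-square and the parameter correlations can be computed like this, given measurement errors sigma_y:

```python
import numpy as np

def reduced_chi_square(y, y_fit, sigma_y, n_params):
    # chi^2 per degree of freedom; values near 1 indicate that the
    # residuals are consistent with the stated measurement errors.
    resid = (y - y_fit) / sigma_y
    return np.sum(resid**2) / (y.size - n_params)

def correlation_matrix(pcov):
    # Off-diagonal elements near +/-1 flag parameters that are
    # "basically doing the same thing".
    perr = np.sqrt(np.diag(pcov))
    return pcov / np.outer(perr, perr)
```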
Addendum
The following question came up: is it that a small S usually indicates a good fit, while a good-looking fit does not necessarily have a small S?
In the fitting process the best parameters are eventually a function of the input data. The error of a parameter reflects how that parameter would change if the input data changed slightly. Say we have a + b x_i = y_i (this is linear, but one can generalize it; we also assume no error in x). Then we have a = f(x, y), and the error s_a is related to d f(x, y) / d y_i. This is just error propagation: how do errors in the y_i affect a, etc.

So most of the time I'd say that a small error means a good fit; in a linear system, for sure. In strange non-linear cases I am quite certain that one can construct a case (a steep local minimum) where the variation of the parameters is small with respect to variations in the input values, while the fit itself is quite bad. In that case one would have small errors and a bad fit at the same time. One would probably see it in the chi-square value, though.
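A small numerical check of that statement (my sketch, not part of the original derivation): for the linear model a + b x, the parameters are linear functions of the y_i, and the pcov returned by curve_fit with default settings reproduces the textbook error-propagation result s^2 (X^T X)^{-1}:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
x = np.linspace(0, 5, 30)
y = 1.0 + 2.0 * x + rng.normal(0, 0.3, x.size)

# Fit a + b*x with curve_fit.
popt, pcov = curve_fit(lambda x, a, b: a + b * x, x, y)

# Textbook error propagation: beta = (X^T X)^{-1} X^T y is linear in y,
# so Cov(beta) = s^2 (X^T X)^{-1}, with s^2 estimated from the residuals.
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
s2 = np.sum((y - X @ beta) ** 2) / (x.size - 2)
cov_analytic = s2 * np.linalg.inv(X.T @ X)

print(np.sqrt(np.diag(pcov)))          # s_a, s_b from curve_fit
print(np.sqrt(np.diag(cov_analytic)))  # same numbers from error propagation
```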
On the other hand, and as mentioned before, you can have a good-looking fit with a very small chi-square but large errors. This is not a problem per se. But if you use the model with the fitted parameters to predict other values, you need to perform error propagation. As a consequence, the prediction might be very good in reality, but the math tells you to announce low confidence.
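That propagation step can be sketched as follows (a hypothetical helper; grad is the gradient of the model with respect to the parameters, evaluated at the prediction point):

```python
import numpy as np

def prediction_sigma(grad, pcov):
    # One-sigma error of a model prediction via linear error propagation:
    # var(f) = g^T pcov g, where g = df/dparams at the prediction point.
    grad = np.asarray(grad)
    return np.sqrt(grad @ pcov @ grad)

# Example for f(x; a, b) = a + b*x: the gradient w.r.t. (a, b) is (1, x),
# so sigma_pred = prediction_sigma([1.0, x_new], pcov).
```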
Some math is given here
Upvotes: 1