Reputation: 41
I use numpy.polyfit to fit observations. polyfit gives me the estimated coefficients of the polynomial and can also provides me the error covariance matrix of the estimated coefficients. Fine. Now, I would like to know if there is a way to estimate the +/- 1sigma uncertainty around the estimated curve.
I know MatLab can do it (https://stats.stackexchange.com/questions/56596/finding-uncertainty-in-coefficients-from-polyfit-in-matlab) but I did not found a way to make it in python.
Upvotes: 4
Views: 10516
Reputation: 5531
If you have enough data points, you can get with the parameter cov=True
an estimated covariance matrix from polyfit()
. Remember that you can write a polynomial p[0]*t**n + p[1]*t**(n-1) + ... + p[n]
as the matrix product np.dot(tt, p)
with tt=[t**n, tt*n-1, ..., 1]
. t
can be either a single value or a column vector. Since this a linear equation, with the covariance matrix C_p
of p
, the covariance matrix of the values is np.dot(tt, np.dot(C_p tt.T))
.
So a simple example
import numpy as np
import matplotlib.pyplot as plt
# sample data:
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0, -3.0])
n = 3 # degree of polynomial
p, C_p = np.polyfit(x, y, n, cov=True) # C_z is estimated covariance matrix
# Do the interpolation for plotting:
t = np.linspace(-0.5, 6.5, 500)
# Matrix with rows 1, t, t**2, ...:
TT = np.vstack([t**(n-i) for i in range(n+1)]).T
yi = np.dot(TT, p) # matrix multiplication calculates the polynomial values
C_yi = np.dot(TT, np.dot(C_p, TT.T)) # C_y = TT*C_z*TT.T
sig_yi = np.sqrt(np.diag(C_yi)) # Standard deviations are sqrt of diagonal
# Do the plotting:
fg, ax = plt.subplots(1, 1)
ax.set_title("Fit for Polynomial (degree {}) with $\pm1\sigma$-interval".format(n))
ax.fill_between(t, yi+sig_yi, yi-sig_yi, alpha=.25)
ax.plot(t, yi,'-')
ax.plot(x, y, 'ro')
ax.axis('tight')
fg.canvas.draw()
plt.show()
gives
Note that calculating the complete matrix C_yi
is computationally and memorywise not very efficient.
Update - on the request on @oliver-w a couple words on the methodology:
polyfit
assumes that the parameters x_i
are deterministic and y_i
are uncorrelated random variables with the expected value y_i
and identical variance sigma
. So it is a linear estimation problem and an ordinary least squares method can be used. By determining the sample variance of the residues, sigma
can be approximated. Based onsigma
the covariance matrix of pp
can be calculated as shown in the Wikipedia article on Least Squares. That is almost the method that polyfit()
uses: There for sigma
the more conservative factor S/(n-m-2)
instead of S/(n-m)
is used.
Upvotes: 10