Reputation: 41

numpy.polyfit: How to get 1-sigma uncertainty around the estimated curve?

I use numpy.polyfit to fit observations. polyfit gives me the estimated coefficients of the polynomial and can also provides me the error covariance matrix of the estimated coefficients. Fine. Now, I would like to know if there is a way to estimate the +/- 1sigma uncertainty around the estimated curve.

I know MatLab can do it (https://stats.stackexchange.com/questions/56596/finding-uncertainty-in-coefficients-from-polyfit-in-matlab) but I did not found a way to make it in python.

Upvotes: 4

Answers (1)

Dietrich

Reputation: 5531

If you have enough data points, you can get with the parameter cov=True an estimated covariance matrix from polyfit(). Remember that you can write a polynomial p[0]*t**n + p[1]*t**(n-1) + ... + p[n] as the matrix product np.dot(tt, p) with tt=[t**n, tt*n-1, ..., 1]. t can be either a single value or a column vector. Since this a linear equation, with the covariance matrix C_p of p, the covariance matrix of the values is np.dot(tt, np.dot(C_p tt.T)).

So a simple example

import numpy as np
import matplotlib.pyplot as plt

# sample data:
x = np.array([0.0, 1.0, 2.0, 3.0,  4.0,  5.0,  6.0])
y = np.array([0.0, 0.8, 0.9, 0.1, -0.8, -1.0, -3.0])

n = 3  # degree of polynomial
p, C_p = np.polyfit(x, y, n, cov=True)  # C_z is estimated covariance matrix

# Do the interpolation for plotting:
t = np.linspace(-0.5, 6.5, 500)
# Matrix with rows 1, t, t**2, ...:
TT = np.vstack([t**(n-i) for i in range(n+1)]).T
yi = np.dot(TT, p)  # matrix multiplication calculates the polynomial values
C_yi = np.dot(TT, np.dot(C_p, TT.T)) # C_y = TT*C_z*TT.T
sig_yi = np.sqrt(np.diag(C_yi))  # Standard deviations are sqrt of diagonal

# Do the plotting:
fg, ax = plt.subplots(1, 1)
ax.set_title("Fit for Polynomial (degree {}) with $\pm1\sigma$-interval".format(n))
ax.fill_between(t, yi+sig_yi, yi-sig_yi, alpha=.25)
ax.plot(t, yi,'-')
ax.plot(x, y, 'ro')
ax.axis('tight')

fg.canvas.draw()
plt.show()

gives Polyfit with 1-sigma intervals

Note that calculating the complete matrix C_yi is computationally and memorywise not very efficient.

Update - on the request on @oliver-w a couple words on the methodology:

polyfit assumes that the parameters x_i are deterministic and y_i are uncorrelated random variables with the expected value y_i and identical variance sigma. So it is a linear estimation problem and an ordinary least squares method can be used. By determining the sample variance of the residues, sigma can be approximated. Based onsigma the covariance matrix of pp can be calculated as shown in the Wikipedia article on Least Squares. That is almost the method that polyfit() uses: There for sigma the more conservative factor S/(n-m-2) instead of S/(n-m) is used.

Upvotes: 10

numpy.polyfit: How to get 1-sigma uncertainty around the estimated curve?

Answers (1)

Related Questions