Gabriel
Gabriel

Reputation: 42389

Estimate the standard deviation of fitted parameters in scipy.odr?

(Somewhat related to this question Linear fit including all errors with NumPy/SciPy, and borrowing code from this one Linear fitting in python with uncertainty in both x and y coordinates)

I fit a linear model (y=a*x+b) using fixed errors in x,y using scipy.odr (code is below), and I get:

Parameters (a, b): [ 5.21806759 -4.08019995]
Standard errors: [ 0.83897588  2.33472161]
Squared diagonal covariance: [ 1.06304228  2.9582588 ]

What is the correct standard deviation values for the fitted a, b parameters? I'm assuming these must be obtained from the Squared diagonal covariance values, but then how are these values related to the Standard errors?

ADD

As mentioned in the answer to How to compute standard error from ODR results? by ali_m, this is apparently related to a bug in scipy.odr. If one uses

np.sqrt(np.diag(out.cov_beta * out.res_var))

(i.e: multiply the covariance by the residual variance) instead of just

np.sqrt(np.diag(out.cov_beta))

the result now coincides with out.sd_beta.

So now my question is: which is the proper standard deviation for the fitted parameters (a, b)? Is it out.sd_beta (equivalently: np.sqrt(np.diag(out.cov_beta * out.res_var))) or np.sqrt(np.diag(out.cov_beta))?





import numpy as np
from scipy.odr import Model, RealData, ODR
import random

random.seed(9001)
np.random.seed(117)

def getData(c):
    """Initiate random data."""
    x = np.array([0, 1, 2, 3, 4, 5])
    y = np.array([i**2 + random.random() for i in x])
    xerr = c * np.array([random.random() for i in x])
    yerr = c * np.array([random.random() for i in x])
    return x, y, xerr, yerr

def linear_func(p, x):
    """Linear model."""
    a, b = p
    return a * x + b

def fitModel(x, y, xerr, yerr):
    # Create a model for fitting.
    linear_model = Model(linear_func)
    # Create a RealData object using our initiated data from above.
    data = RealData(x, y, sx=xerr, sy=yerr)
    # Set up ODR with the model and data.
    odr = ODR(data, linear_model, beta0=[0., 1.])
    # Run the regression.
    out = odr.run()

    # Estimated parameter values
    beta = out.beta
    print("Parameters (a, b): {}".format(beta))
    # Standard errors of the estimated parameters
    std = out.sd_beta
    print("Standard errors: {}".format(std))
    # Covariance matrix of the estimated parameters
    cov = out.cov_beta
    stddev = np.sqrt(np.diag(cov))
    print("Squared diagonal covariance: {}".format(stddev))


# Generate data and fit the model.
x, y, xerr, yerr = getData(1.)
fitModel(x, y, xerr, yerr)

Upvotes: 1

Views: 2400

Answers (1)

ali_m
ali_m

Reputation: 74232

Yes, out.sd_beta contains the standard deviations for the estimated parameters, which are equivalent to the square roots of the diagonal terms in the parameter covariance matrix.

As you've already mentioned above, there's a bug in scipy.odr that means you have to multiply out.cov_beta by the residual variance out.res_var in order to derive the actual covariance matrix for the parameters.

Upvotes: 3

Related Questions