adammoyle
adammoyle

Reputation: 309

Errors on parameters using scipy.curve_fit

I am fitting the following function (variables A, D, μ and τ) and x and E are fixed:
enter image description here

I created some example data using the equation and added some noise. The fit looks very good and has a low chi-squared however the errors from the covariance matrix are odd; some are very large whereas others are smaller. What am I doing wrong?

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Constants
E_field = 1
x = 1

def function(t, A, D, μ, τ):
  return A/np.sqrt(4*np.pi*D*t) * np.exp(-pow(x-μ*E_field*t, 2)/(4*D*t) - t/τ)

def chi(E, O):
  return np.sum(np.ma.masked_invalid(pow(O-E, 2)/E))

def fit(t, n, m, p0):
  ddof = n.size - m
  popt, pcov = curve_fit(function, t, n, p0=p0)
  fitted_n = function(t, *popt)
  reduced_χ_squared = chi(n, fitted_n) / ddof
  σ = np.sqrt(np.diag(pcov))
  return popt, σ, reduced_χ_squared

# Choose random variables to generate data
x, t = 1, np.linspace(0.01, 5, num=100)
A, D, μ, τ = 1, 0.2, 1, 1
n = function(t, A, D, μ, τ)

n_noise = n + 0.005 * np.random.normal(size=n.size)
n_noise += abs(min(n_noise)) # Shift data to lie on y = 0

p0 = [1, 0.25, 1, 1]
vars, σ, reduced_χ_squared = fit(t, n_noise, 4, p0)
fitted_A, fitted_D, fitted_μ, fitted_τ = vars
σ_A, σ_D, σ_μ, σ_τ = σ
fitted_n = function(t, *vars)

fig, ax = plt.subplots()

ax.plot(t, n_noise)
ax.plot(t, fitted_n)
#ax.text(0.82, 0.75, "χᵣ²={:.4f}".format(reduced_χ_squared), transform = ax.transAxes)
ax.legend(["Observed n", "Expected n"])
print("Fitted parameters: A = {:.4f}, D = {:.4f}, μ = {:.4f}, τ = {:.4f}".format(*vars))
print("Fitted parameter errors: σ_A = {:.4f}, σ_D = {:.4f}, σ_μ = {:.4f}, σ_τ = {:.4f}".format(*σ))
print("Reduced χ² = {:.4f}".format(reduced_χ_squared))

Running this code gives me the following output
Fitted function overlayed on "observed" data

Upvotes: 0

Views: 1501

Answers (1)

mikuszefski
mikuszefski

Reputation: 4043

As mentioned in my comment above, correlation is a big problem here. Biggest problem though is that you fit more parameters than required.

Let us transform:

A = exp( alpha) i.e alpha = log(A)
delta = 4 * D
epsilon = mu * E

We then get:

1 / sqrt( pi* delta ) *  exp( -( x**2 + epsilon**2 * t**2 -2*x*epsilon t) / ( delta * t ) -t / tau + alpha )

= 1 / sqrt( pi* delta ) * exp( -( x**2 + epsilon**2 * t**2 -2*x*epsilon t) / ( delta * t ) -delta / tau * t**2/( delta * t) + delta * alpha * t/ ( delta * t ) )

= 1 / sqrt( pi* delta ) * exp( -( x**2 + epsilon**2 * t**2 -2*x*epsilon t + delta / tau * t**2 - delta * alpha * t ) / ( delta * t ) )

= 1 / sqrt( pi* delta ) * exp( -( x**2 + ( epsilon**2 + delta / tau ) * t**2 -x * ( 2 * epsilon + delta * alpha ) * t ) / ( delta * t ) )

now renaming:

( epsilon**2 + delta / tau ) -> gamma**2

( 2 * epsilon + delta * alpha ) -> eta

we get

= 1 / sqrt( pi * delta ) * exp( -( x**2 + gamma**2 * t**2 - x * eta * t ) / ( delta * t ) )

So there are actually only 3 parameters to fit and it looks like this:



import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Constants
E_field = 1
x = 1

def function(t, A, D, μ, τ):
  return A/np.sqrt(4*np.pi*D*t) * np.exp(-pow(x-μ*E_field*t, 2)/(4*D*t) - t/τ)

def alt_func( t, gamma, eta, delta ):
    return np.exp( -( x**2 + gamma**2 * t**2 - eta * t ) / ( delta * t ) ) / np.sqrt( np.pi * delta * t )


# Choose random variables to generate data
x, t = 1, np.linspace(0.01, 5, num=100)
A, D, μ, τ = 1, 0.2, 1, 1
n = function(t, A, D, μ, τ)

n_noise = n + 0.005 * np.random.normal(size=n.size)
n_noise += abs(min(n_noise)) # Shift data to lie on y = 0

guess=[1.34, 2, .8]
palt, covalt = curve_fit( alt_func, t, n_noise)
print( covalt )
print( palt )
yt = alt_func( t, *palt )
yg = alt_func( t, *guess )
yorg = function( t, A, D, μ, τ )

fig, ax = plt.subplots()

ax.plot(t, n_noise)
ax.plot(t, yg )
ax.plot(t, yt, ls="--")
ax.plot(t, yorg, ls=":" )
plt.show()

This has a reasonable covariance matrix. One can get the original parameters easily via error propagation.

Altzernatively, it should be enough to fix A=1 and only fit the three left parameters in the original function.

Concerning the transformation and back calculation one has to keep in mind that this is of course from R³ to R⁴, so it is naturally not unique either. Again one can just fix one value, or one might to try to spread the error evenly between the parameters or who knows....

Upvotes: 1

Related Questions