Reputation: 309
I am fitting the following function (variables A, D, μ and τ; x and E are fixed):

n(t) = A / sqrt( 4 * pi * D * t ) * exp( -( x - μ * E * t )**2 / ( 4 * D * t ) - t / τ )

I created some example data from this equation and added some noise. The fit looks very good and has a low chi-squared; however, the errors from the covariance matrix look odd: some are very large whereas others are much smaller. What am I doing wrong?
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Constants
E_field = 1
x = 1
def function(t, A, D, μ, τ):
    return A/np.sqrt(4*np.pi*D*t) * np.exp(-pow(x - μ*E_field*t, 2)/(4*D*t) - t/τ)

def chi(E, O):
    return np.sum(np.ma.masked_invalid(pow(O - E, 2)/E))
def fit(t, n, m, p0):
    ddof = n.size - m
    popt, pcov = curve_fit(function, t, n, p0=p0)
    fitted_n = function(t, *popt)
    reduced_χ_squared = chi(n, fitted_n) / ddof
    σ = np.sqrt(np.diag(pcov))
    return popt, σ, reduced_χ_squared
# Choose random variables to generate data
x, t = 1, np.linspace(0.01, 5, num=100)
A, D, μ, τ = 1, 0.2, 1, 1
n = function(t, A, D, μ, τ)
n_noise = n + 0.005 * np.random.normal(size=n.size)
n_noise += abs(min(n_noise)) # Shift data to lie on y = 0
p0 = [1, 0.25, 1, 1]
vars, σ, reduced_χ_squared = fit(t, n_noise, 4, p0)
fitted_A, fitted_D, fitted_μ, fitted_τ = vars
σ_A, σ_D, σ_μ, σ_τ = σ
fitted_n = function(t, *vars)
fig, ax = plt.subplots()
ax.plot(t, n_noise)
ax.plot(t, fitted_n)
#ax.text(0.82, 0.75, "χᵣ²={:.4f}".format(reduced_χ_squared), transform = ax.transAxes)
ax.legend(["Observed n", "Expected n"])
print("Fitted parameters: A = {:.4f}, D = {:.4f}, μ = {:.4f}, τ = {:.4f}".format(*vars))
print("Fitted parameter errors: σ_A = {:.4f}, σ_D = {:.4f}, σ_μ = {:.4f}, σ_τ = {:.4f}".format(*σ))
print("Reduced χ² = {:.4f}".format(reduced_χ_squared))
Running this code gives me the following output:
Upvotes: 0
Views: 1501
Reputation: 4043
As mentioned in my comment above, correlation is a big problem here. The biggest problem, though, is that you are fitting more parameters than required.
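To see this directly, one can turn the covariance matrix from the question's fit into a correlation matrix (a minimal sketch reusing pcov and numpy; off-diagonal values near ±1 mean the parameters are strongly coupled):

sigma = np.sqrt(np.diag(pcov))        # 1-sigma errors, as in the question
corr = pcov / np.outer(sigma, sigma)  # corr[i, j] = pcov[i, j] / (sigma_i * sigma_j)
print(np.round(corr, 3))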
Let us transform:
A = exp( alpha ), i.e. alpha = log( A )
delta = 4 * D
epsilon = mu * E
We then get:
1 / sqrt( pi * delta * t ) * exp( -( x**2 + epsilon**2 * t**2 - 2*x*epsilon*t ) / ( delta * t ) - t / tau + alpha )
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + epsilon**2 * t**2 - 2*x*epsilon*t ) / ( delta * t ) - delta / tau * t**2 / ( delta * t ) + delta * alpha * t / ( delta * t ) )
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + epsilon**2 * t**2 - 2*x*epsilon*t + delta / tau * t**2 - delta * alpha * t ) / ( delta * t ) )
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + ( epsilon**2 + delta / tau ) * t**2 - ( 2*x*epsilon + delta * alpha ) * t ) / ( delta * t ) )
now renaming:
( epsilon**2 + delta / tau ) -> gamma**2
( 2*x*epsilon + delta * alpha ) -> eta
we get
= 1 / sqrt( pi * delta * t ) * exp( -( x**2 + gamma**2 * t**2 - eta * t ) / ( delta * t ) )
So there are actually only three parameters to fit, and it looks like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Constants
E_field = 1
x = 1
def function(t, A, D, μ, τ):
    return A/np.sqrt(4*np.pi*D*t) * np.exp(-pow(x - μ*E_field*t, 2)/(4*D*t) - t/τ)

def alt_func(t, gamma, eta, delta):
    # reduced three-parameter form derived above
    return np.exp(-(x**2 + gamma**2 * t**2 - eta * t) / (delta * t)) / np.sqrt(np.pi * delta * t)
# Choose random variables to generate data
x, t = 1, np.linspace(0.01, 5, num=100)
A, D, μ, τ = 1, 0.2, 1, 1
n = function(t, A, D, μ, τ)
n_noise = n + 0.005 * np.random.normal(size=n.size)
n_noise += abs(min(n_noise)) # Shift data to lie on y = 0
guess = [1.34, 2, .8]  # transformed true values: gamma = sqrt(1.8) ≈ 1.34, eta = 2, delta = 0.8
palt, covalt = curve_fit(alt_func, t, n_noise)  # the default start values (all ones) suffice here
print(covalt)
print(palt)
yt = alt_func(t, *palt)         # fitted curve
yg = alt_func(t, *guess)        # curve at the transformed true values
yorg = function(t, A, D, μ, τ)  # original four-parameter curve at the true values
fig, ax = plt.subplots()
ax.plot(t, n_noise)
ax.plot(t, yg )
ax.plot(t, yt, ls="--")
ax.plot(t, yorg, ls=":" )
plt.show()
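As a quick cross-check (a sketch using the true values and variables defined above; with A = 1 we have alpha = 0), the forward mapping reproduces the guess used for yg, and both model forms agree:

# Forward mapping (A, D, μ, τ) -> (gamma, eta, delta)
alpha = np.log(A)                        # A = 1  ->  alpha = 0
delta = 4 * D                            # 0.8
epsilon = μ * E_field                    # 1.0
gamma = np.sqrt(epsilon**2 + delta / τ)  # sqrt(1.8) ≈ 1.342
eta = 2 * x * epsilon + delta * alpha    # 2.0
print(gamma, eta, delta)                 # matches guess = [1.34, 2, .8]
print(np.allclose(function(t, A, D, μ, τ), alt_func(t, gamma, eta, delta)))  # True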
This has a reasonable covariance matrix. One can get the original parameters easily via error propagation.
Alternatively, it should be enough to fix A = 1 and only fit the three remaining parameters in the original function, as in the sketch below.
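A minimal sketch of that variant, reusing function, t and n_noise from above and the question's start values with A dropped:

# Fix A = 1 and fit only the remaining three parameters D, μ, τ
popt3, pcov3 = curve_fit(lambda t, D, μ, τ: function(t, 1, D, μ, τ),
                         t, n_noise, p0=[0.25, 1, 1])
print(np.sqrt(np.diag(pcov3)))  # parameter errors; expected to be much better behaved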
Concerning the transformation and the back-calculation, one has to keep in mind that this map goes from R³ to R⁴, so it is naturally not unique either. Again, one can just fix one value, or one might try to spread the error evenly between the parameters, or who knows....
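For the simplest choice, fixing A = 1 (i.e. alpha = 0), the back-calculation from the fitted parameters looks like this (a sketch; the errors on D, μ and τ follow by propagating covalt through this map):

# Back-calculation assuming A = 1 (alpha = 0), inverting the renaming above
gamma, eta, delta = palt
D_fit = delta / 4                                  # delta = 4 * D
mu_fit = eta / (2 * x) / E_field                   # eta = 2*x*epsilon, epsilon = mu * E
tau_fit = delta / (gamma**2 - (eta / (2 * x))**2)  # gamma**2 = epsilon**2 + delta / tau
print(D_fit, mu_fit, tau_fit)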
Upvotes: 1