Reputation: 45
Trying to fit ExponentialGaussianModel()
from lmfit
but getting the following error message: The input contains nan values
I am using Jupyternotebook on windows and I am new to python and lmfit. I find the lmfit documentation to be a bit obscure for a beginner and hope to find help here. Following is my code: I would like to generate a exponential-gaussian histogram, extract data points and practice fitting with lmfit
library. ( I would like to practice fitting and finding least number of points that would reproduce the parameters used producing the histogram)
from scipy.stats import exponnorm
from lmfit.models import ExponentialGaussianModel
K2 = 1.5
r2 = exponnorm.rvs(K2, size=500, loc = 205, scale = 40)
Q = np.histogram(r2,500)
exp_gaus_x = Q[1]
exp_gaus_y = Q[0]
tof_x = exp_gaus_x[1:]
tof_y = exp_gaus_y
mod = ExponentialGaussianModel()
pars = mod.guess(tof_y, x=tof_x)
out = mod.fit(tof_y, pars, x=tof_x)
(out.fit_report(min_correl=0.25))
I get the error that there are nan input values. I was expecting a report like shown in the manual.
Upvotes: 3
Views: 1178
Reputation: 7862
The definition of exponential Gaussian used in lmfit
is from https://en.wikipedia.org/wiki/Exponentially_modified_Gaussian_distribution.
With the exponential term being
exp[center*gamma + (gamma*sigma)**2/2 - gamma*x]
this has the tendency to give Inf
for large-ish values of sigma
and gamma
, and/or center
. I believe that you are getting such Inf
values and that this is the cause of the NaN
message you are seeing. The fitting routines (in Fortran) do not handle NaN
or Inf
gracefully (actually, "at all"). This is a real limitation of that particular model. You'll see that the examples on the wikipedia page all have x
values much closer to 1
than 200
, and gamma
and sigma
also on the order of 1, not around 50, which is what the automated guess
gives.
I think that a simpler definition for an exponentially modified Gaussian would be better for you. With
def expgaussian(x, amplitude=1, center=0, sigma=1.0, gamma=1.0):
""" an alternative exponentially modified Gaussian."""
dx = center-x
return amplitude* np.exp(gamma*dx) * erfc( dx/(np.sqrt(2)*sigma))
You'll get a decent fit, though the meaning of the parameters has changed a bit, and you will need to explicitly give starting values, not rely on a guess()
procedure. These don't have to be very close, just not very far off.
A full script might be:
import numpy as np
from scipy.stats import exponnorm
from scipy.special import erfc
from lmfit import Model
import matplotlib.pyplot as plt
def expgaussian(x, amplitude=1, center=0, sigma=1.0, gamma=1.0):
""" an alternative exponentially modified Gaussian."""
dx = center-x
return amplitude* np.exp(gamma*dx) * erfc( dx/(np.sqrt(2)*sigma))
K2 = 1.5
r2 = exponnorm.rvs(K2, size=500, loc = 205, scale = 40)
Q = np.histogram(r2,500)
exp_gaus_x = Q[1]
exp_gaus_y = Q[0]
tof_x = exp_gaus_x[1:]
tof_y = exp_gaus_y
mod = Model(expgaussian)
pars = mod.make_params(sigma=20, gamma=0.1, amplitude=2, center=250)
out = mod.fit(tof_y, pars, x=tof_x)
print(out.fit_report())
plt.plot(tof_x, tof_y, label='data')
plt.plot(tof_x, out.best_fit, label='fit')
plt.legend()
plt.show()
which will print out
[[Model]]
Model(expgaussian)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 65
# data points = 500
# variables = 4
chi-square = 487.546692
reduced chi-square = 0.98295704
Akaike info crit = -4.61101662
Bayesian info crit = 12.2474158
[[Variables]]
gamma: 0.01664876 +/- 0.00239048 (14.36%) (init = 0.1)
sigma: 39.6914678 +/- 3.90960254 (9.85%) (init = 20)
center: 235.600396 +/- 11.8976560 (5.05%) (init = 250)
amplitude: 3.43975318 +/- 0.15675053 (4.56%) (init = 2)
[[Correlations]] (unreported correlations are < 0.100)
C(gamma, center) = 0.930
C(sigma, center) = 0.870
C(gamma, amplitude) = 0.712
C(gamma, sigma) = 0.693
C(center, amplitude) = 0.572
C(sigma, amplitude) = 0.285
and show a plot like
Hope that helps.
Upvotes: 3