Reputation: 151
I have a small spectrum peak and I'm trying to fit a Gaussian function to it. I found an example online and combined that code with my own.
wveleng = [639.188, 639.454, 639.719, 639.985, 640.25, 640.516, 640.781, 641.046, 641.312, 641.577]
counts = [778., 1613.8, 12977.4, 32990., 33165.2, 13171., 2067.2, 900.8, 788.8, 747.8]
My first code is the following:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def gaus(x, a, mu, sigma):
    return a * np.exp(-(x - mu)**2 / (2 * sigma**2))

a = ydata.max()
x0 = ydata.mean()
sigm = ydata.std()
mean = sum(ydata*xdata)/len(ydata)
sigma = np.sqrt(sum(ydata*(xdata-mean)**2)/len(ydata))
#print(ydata.max())
popt, pcov = curve_fit(gaus, xdata, ydata, maxfev=991, p0=[a, x0, sigm])
#gmodel = Model(Gauss)
#result = gmodel.fit(ydata, x=xdata, a=ydata.max(), x0=ydata.mean(), sigma=ydata.std())
print(popt)
#plt.scatter(xdata, ydata, label='data points')
#plt.plot(xdata, result.best_fit, 'r-')
#popt, pcov = curve_fit(gaus, xdata, ydata, p0=[ydata.max(), ydata.mean(), ydata.std()])
xx = np.linspace(639, 642, 10)
plt.plot(xx, gaus(xdata, *popt), 'r-', label='fit')
The plot I get does not look right; I think it has to do with the initial guess parameters.
Here is a second version, which I find more compact and which suits me better:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model

def gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

ydata = np.array([778., 1613.8, 12977.4, 32990., 33165.2, 13171., 2067.2, 900.8, 788.8, 747.8])
xx = np.arange(639, 642, 100)
xdata = np.array([639.188, 639.454, 639.719, 639.985, 640.250, 640.516, 640.781, 641.046, 641.312, 641.577])
#plt.plot(xdata, ydata, 'bo', label='data')

def Gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2))

gmodel = Model(Gauss)
result = gmodel.fit(ydata, x=xdata, a=ydata.max(), x0=ydata.mean(), sigma=ydata.std())
plt.scatter(xdata, ydata, label='data points')
plt.plot(xdata, result.best_fit, 'r-')
I get exactly the same fit as with the first method. Is there a way to plot the fit over more points than the data itself?
Upvotes: 1
Views: 3277
Reputation: 7862
I think you are really close, though I have to admit I do not understand what xx is meant to represent. You definitely want the data to be fit (ydata) and the independent variable (xdata) to be the same length.
I think the main problem you are running into now is that your initial guesses are not very good, and that you will get good results with

result = gmodel.fit(ydata, x=xdata, a=ydata.max(), x0=xdata.mean(), sigma=xdata.std())

(with xdata instead of ydata controlling the initial values for x0 and sigma).
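The same fix applies to the plain curve_fit version in your first snippet; a minimal sketch, assuming the gaus function and the xdata/ydata arrays defined there:

from scipy.optimize import curve_fit
# initial guesses for (a, mu, sigma) now come from the right arrays
popt, pcov = curve_fit(gaus, xdata, ydata,
                       p0=[ydata.max(), xdata.mean(), xdata.std()])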
Perhaps even better would be to add some sanity checks to the range of parameter values, as with
params = gmodel.make_params(a=ydata.max(),
                            x0=xdata.mean(),
                            sigma=xdata.std())
params['x0'].min = min(xdata)
params['x0'].max = max(xdata)
params['sigma'].max = 5
result = gmodel.fit(ydata, params, x=xdata)
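On the last part of your question: you do not need to fit more points than you have. After the fit you can evaluate the model on as fine an x grid as you like (which is probably what xx was meant for). A minimal sketch, assuming the result object from the fit above:

xfine = np.linspace(639, 642, 201)   # much denser grid than the data
yfine = result.eval(x=xfine)         # evaluate the best-fit model on that grid
plt.scatter(xdata, ydata, label='data points')
plt.plot(xfine, yfine, 'r-', label='fit')
plt.legend()
plt.show()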
Finally, using the built-in models like GaussianModel will report both sigma and fwhm, and also amplitude (that is, the integral of the Gaussian) and height (the maximum value the Gaussian would take). So, this script:
import numpy as np
from lmfit.models import GaussianModel
import matplotlib.pyplot as plt
ydata = np.array([778.,1613.8,12977.4,32990.,33165.2,13171.,2067.2,900.8,788.8,747.8])
xdata = np.array([639.188,639.454,639.719,639.985,640.250,640.516,640.781,641.046,641.312,641.577])
gmodel = GaussianModel()
params = gmodel.make_params(amplitude=ydata.max(),
                            center=xdata.mean(),
                            sigma=xdata.std())
result = gmodel.fit(ydata, params, x=xdata)
print(result.fit_report())
plt.scatter(xdata,ydata,label='data points')
plt.plot(xdata, result.best_fit, 'r-')
plt.show()
will print out
[[Model]]
    Model(gaussian)
[[Fit Statistics]]
    # function evals   = 27
    # data points      = 10
    # variables        = 3
    chi-square         = 2360971.771
    reduced chi-square = 337281.682
    Akaike info crit   = 129.720
    Bayesian info crit = 130.628
[[Variables]]
    sigma:       0.27525232 +/- 0.004505 (1.64%) (init= 0.7623906)
    center:      640.119396 +/- 0.004490 (0.00%) (init= 640.3828)
    amplitude:   25633.2702 +/- 362.5571 (1.41%) (init= 33165.2)
    fwhm:        0.64816968 +/- 0.010608 (1.64%)  == '2.3548200*sigma'
    height:      37152.0777 +/- 525.0717 (1.41%)  == '0.3989423*amplitude/max(1.e-15, sigma)'
[[Correlations]] (unreported correlations are < 0.100)
    C(sigma, amplitude)          =  0.579
and give a pretty good-looking fit. For advanced practice, I recommend trying to add a ConstantModel() to give a background offset. Well, and collect more data points ;).
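A minimal sketch of that suggestion (the ConstantModel parameter is named c; the initial values here are only rough guesses):

from lmfit.models import GaussianModel, ConstantModel

# Gaussian peak plus a constant background offset
model = GaussianModel() + ConstantModel()
params = model.make_params(amplitude=ydata.max(),
                           center=xdata.mean(),
                           sigma=xdata.std(),
                           c=ydata.min())
result = model.fit(ydata, params, x=xdata)
print(result.fit_report())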
Upvotes: 1
Reputation: 7862
scipy.integrate.quad is not doing convolution, as you seem to expect. quad(function, lower_bound, upper_bound)[0] will return a single value for the integral of the function between the bounds.

OTOH, curve_fit(func, ...) needs an array of values for the model function, and it is complaining that it got a float, not an ndarray. Maybe you intended to do curve_fit(vfunc, ...)?
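For illustration, the usual pattern is to wrap the scalar, quad-based model with numpy.vectorize so that curve_fit gets an ndarray back; the model function and integration bounds below are hypothetical, since the original code is not shown:

import numpy as np
from scipy.integrate import quad
from scipy.optimize import curve_fit

# hypothetical scalar model: each y value is an integral that depends on x,
# so quad() returns a single float for a single x value
def model_scalar(x, a, w):
    return a * quad(lambda t: np.exp(-(t - x)**2 / (2 * w**2)), x - 5*w, x + 5*w)[0]

# np.vectorize lets the function accept and return arrays, as curve_fit expects
vfunc = np.vectorize(model_scalar)

xdata = np.linspace(0.0, 10.0, 50)
ydata = vfunc(xdata, 2.0, 1.5) + np.random.normal(scale=0.05, size=xdata.size)
popt, pcov = curve_fit(vfunc, xdata, ydata, p0=[1.0, 1.0])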
You might find lmfit (https://lmfit.github.io/lmfit-py/) useful. It has convenient, high-level tools for curve fitting, and simple model functions like a Gaussian are built in. It also has a mechanism for fitting a model that is the sum or product of two functions, and it can even create a model that consists of the convolution of two functions. For an example of this, see the examples described in https://lmfit.github.io/lmfit-py/model.html#composite-models-adding-or-multiplying-models.
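A rough sketch of that convolution mechanism, adapted from the example on the linked page (the step function, the built-in gaussian lineshape, and the convolve helper follow that documentation example; your model will differ):

import numpy as np
from lmfit import CompositeModel, Model
from lmfit.lineshapes import gaussian

# simple Heaviside step, as in the documentation example
def jump(x, mid):
    out = np.zeros(x.size)
    out[x >= mid] = 1.0
    return out

# zero-padded convolution of two same-length arrays
def convolve(arr, kernel):
    npts = min(arr.size, kernel.size)
    pad = np.ones(npts)
    tmp = np.concatenate((pad * arr[0], arr, pad * arr[-1]))
    out = np.convolve(tmp, kernel, mode='valid')
    noff = int((len(out) - npts) / 2)
    return out[noff:noff + npts]

# composite model: a step convolved with a Gaussian
mod = CompositeModel(Model(jump), Model(gaussian), convolve)
pars = mod.make_params(mid=5.0, amplitude=1.0, center=3.5, sigma=1.5)

x = np.linspace(0, 10, 201)
y = mod.eval(pars, x=x)    # or mod.fit(data, pars, x=x) with real data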
Upvotes: 1