Fit a Gaussian to measured peak

Question

I have a small spectrum peak and I'm trying to fit a Gaussian function to it. I searched an example online and mixed the code with the one I made.

wveleng=[ 639.188  639.454  639.719  639.985  640.25   640.516  640.781  641.046
      641.312  641.577]
    counts=[   778.    1613.8  12977.4  32990.   33165.2  13171.    2067.2    900.8
        788.8    747.8]

my first code is the following

def gaus(x,a,mu,sigma):
    return a*exp(-(x-mu)**2/(2*sigma**2))

    a=ydata.max()
x0=ydata.mean()
sigm=ydata.std()

mean = sum(ydata*xdata)/len(ydata)
sigma = np.sqrt(sum(ydata*(xdata-mean)**2)/len(ydata))

#print(ydata.max())
popt, pcov = curve_fit(Gauss, xdata,ydata,maxfev=991,p0=[a,x0,sigm])    
#gmodel = Model(Gauss)
#result = gmodel.fit(ydata, x=xdata, a=ydata.max(),x0=ydata.mean(),sigm=ydata.std())
print(popt)
#plt.scatter(xdata,ydata,label='data points')
#plt.plot(xdata, result.best_fit, 'r-')
#popt, pcov = curve_fit(gauss, xdata, ydata,p0=[ydata.max(), ydata.mean(), ydata.std()])
xx = np.linspace(639,642, 10)
plt.plot(xx, gauss(xdata, *popt), 'r-', label='fit')

with plot i get the following.

i think it has to do with initial guess parameters

a second code which i find more compact and suits better for me.

    def gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

ydata = np.array([778.,1613.8,12977.4,32990.,33165.2,13171.,2067.2,900.8,788.8,747.8])

xx = np.arange(639,642, 100)
xdata=np.array([639.188,639.454,639.719,639.985,640.250,640.516,640.781,641.046,641.312,641.577])


#plt.plot(xdata, ydata, 'bo', label='data')
def Gauss(x, a, x0, sigm):
    return a * np.exp(-(x - x0)**2 / (2 * sigm**2))

gmodel = Model(Gauss)
result = gmodel.fit(ydata, x=xdata, a=ydata.max(),x0=ydata.mean(),sigm=ydata.std())
plt.scatter(xdata,ydata,label='data points')
plt.plot(xdata, result.best_fit, 'r-')

i get exactly the same fit as the 1st method. Is there a way fit more points in than the data itself

M Newville · Accepted Answer

I think you are really close, though I have to admit I do not understand what xx is meant to represent. You definitely want the data to be fit (ydata) and the independent variable (xdata) to be the same length.

I think the main problem you are running into now is that your initial guesses are not very good, and that you will get good results with

result = gmodel.fit(ydata, x=xdata, a=ydata.max(), x0=xdata.mean(), sigma=xdata.std())

(with xdata instead of ydata controlling the initial values for x0 and sigma).

Perhaps even better would be to add some sanity checks to the range of parameter values, as with

params = gmodel.make_params(a=ydata.max(),
                            x0=xdata.mean(),
                            sigma=xdata.std())
params['x0'].min = min(xdata)
params['x0'].max = max(xdata)
params['sigma'].max = 5
result = gmodel.fit(ydata, params, x=xdata)

Finally, using the built-in models like GaussianModel will report both sigma and fwhm and also amplitude (that is, integral of the Gaussian) and height (the maximum value the Gaussian would take). So, this script:

import numpy as np
from lmfit.models import GaussianModel
import matplotlib.pyplot as plt

ydata = np.array([778.,1613.8,12977.4,32990.,33165.2,13171.,2067.2,900.8,788.8,747.8])
xdata = np.array([639.188,639.454,639.719,639.985,640.250,640.516,640.781,641.046,641.312,641.577])

gmodel = GaussianModel()
params = gmodel.make_params(amplitude=ydata.max(),
                            center=xdata.mean(),
                            sigma=xdata.std())
result = gmodel.fit(ydata, params, x=xdata)

print(result.fit_report())
plt.scatter(xdata,ydata,label='data points')
plt.plot(xdata, result.best_fit, 'r-')
plt.show()

will print out

[[Model]]
    Model(gaussian)
[[Fit Statistics]]
    # function evals   = 27
    # data points      = 10
    # variables        = 3
    chi-square         = 2360971.771
    reduced chi-square = 337281.682
    Akaike info crit   = 129.720
    Bayesian info crit = 130.628
[[Variables]]
    sigma:       0.27525232 +/- 0.004505 (1.64%) (init= 0.7623906)
    center:      640.119396 +/- 0.004490 (0.00%) (init= 640.3828)
    amplitude:   25633.2702 +/- 362.5571 (1.41%) (init= 33165.2)
    fwhm:        0.64816968 +/- 0.010608 (1.64%)  == '2.3548200*sigma'
    height:      37152.0777 +/- 525.0717 (1.41%)  == '0.3989423*amplitude/max(1.e-15, sigma)'
[[Correlations]] (unreported correlations are <  0.100)
    C(sigma, amplitude)          =  0.579

and give a pretty good-looking fit. For advanced practice, I recommend trying to add a ConstantModel() to give a background offset. Well, and collect more data points ;).

Fit a Gaussian to measured peak

Answers (2)

Related Questions