kevin
kevin

Reputation: 151

Fit a Gaussian to measured peak

I have a small spectrum peak and I'm trying to fit a Gaussian function to it. I searched an example online and mixed the code with the one I made.

wveleng=[ 639.188  639.454  639.719  639.985  640.25   640.516  640.781  641.046
      641.312  641.577]
    counts=[   778.    1613.8  12977.4  32990.   33165.2  13171.    2067.2    900.8
        788.8    747.8]

my first code is the following

def gaus(x,a,mu,sigma):
    return a*exp(-(x-mu)**2/(2*sigma**2))

    a=ydata.max()
x0=ydata.mean()
sigm=ydata.std()

mean = sum(ydata*xdata)/len(ydata)
sigma = np.sqrt(sum(ydata*(xdata-mean)**2)/len(ydata))

#print(ydata.max())
popt, pcov = curve_fit(Gauss, xdata,ydata,maxfev=991,p0=[a,x0,sigm])    
#gmodel = Model(Gauss)
#result = gmodel.fit(ydata, x=xdata, a=ydata.max(),x0=ydata.mean(),sigm=ydata.std())
print(popt)
#plt.scatter(xdata,ydata,label='data points')
#plt.plot(xdata, result.best_fit, 'r-')
#popt, pcov = curve_fit(gauss, xdata, ydata,p0=[ydata.max(), ydata.mean(), ydata.std()])
xx = np.linspace(639,642, 10)
plt.plot(xx, gauss(xdata, *popt), 'r-', label='fit')

with plot i get the following.

enter image description here

i think it has to do with initial guess parameters

a second code which i find more compact and suits better for me.

    def gauss(x, a, x0, sigma):
    return a * np.exp(-(x - x0) ** 2 / (2 * sigma ** 2))

ydata = np.array([778.,1613.8,12977.4,32990.,33165.2,13171.,2067.2,900.8,788.8,747.8])

xx = np.arange(639,642, 100)
xdata=np.array([639.188,639.454,639.719,639.985,640.250,640.516,640.781,641.046,641.312,641.577])


#plt.plot(xdata, ydata, 'bo', label='data')
def Gauss(x, a, x0, sigm):
    return a * np.exp(-(x - x0)**2 / (2 * sigm**2))

gmodel = Model(Gauss)
result = gmodel.fit(ydata, x=xdata, a=ydata.max(),x0=ydata.mean(),sigm=ydata.std())
plt.scatter(xdata,ydata,label='data points')
plt.plot(xdata, result.best_fit, 'r-')

i get exactly the same fit as the 1st method. Is there a way fit more points in than the data itself

Upvotes: 1

Views: 3277

Answers (2)

M Newville
M Newville

Reputation: 7862

I think you are really close, though I have to admit I do not understand what xx is meant to represent. You definitely want the data to be fit (ydata) and the independent variable (xdata) to be the same length.

I think the main problem you are running into now is that your initial guesses are not very good, and that you will get good results with

result = gmodel.fit(ydata, x=xdata, a=ydata.max(), x0=xdata.mean(), sigma=xdata.std())

(with xdata instead of ydata controlling the initial values for x0 and sigma).

Perhaps even better would be to add some sanity checks to the range of parameter values, as with

params = gmodel.make_params(a=ydata.max(),
                            x0=xdata.mean(),
                            sigma=xdata.std())
params['x0'].min = min(xdata)
params['x0'].max = max(xdata)
params['sigma'].max = 5
result = gmodel.fit(ydata, params, x=xdata)

Finally, using the built-in models like GaussianModel will report both sigma and fwhm and also amplitude (that is, integral of the Gaussian) and height (the maximum value the Gaussian would take). So, this script:

import numpy as np
from lmfit.models import GaussianModel
import matplotlib.pyplot as plt

ydata = np.array([778.,1613.8,12977.4,32990.,33165.2,13171.,2067.2,900.8,788.8,747.8])
xdata = np.array([639.188,639.454,639.719,639.985,640.250,640.516,640.781,641.046,641.312,641.577])

gmodel = GaussianModel()
params = gmodel.make_params(amplitude=ydata.max(),
                            center=xdata.mean(),
                            sigma=xdata.std())
result = gmodel.fit(ydata, params, x=xdata)

print(result.fit_report())
plt.scatter(xdata,ydata,label='data points')
plt.plot(xdata, result.best_fit, 'r-')
plt.show()

will print out

[[Model]]
    Model(gaussian)
[[Fit Statistics]]
    # function evals   = 27
    # data points      = 10
    # variables        = 3
    chi-square         = 2360971.771
    reduced chi-square = 337281.682
    Akaike info crit   = 129.720
    Bayesian info crit = 130.628
[[Variables]]
    sigma:       0.27525232 +/- 0.004505 (1.64%) (init= 0.7623906)
    center:      640.119396 +/- 0.004490 (0.00%) (init= 640.3828)
    amplitude:   25633.2702 +/- 362.5571 (1.41%) (init= 33165.2)
    fwhm:        0.64816968 +/- 0.010608 (1.64%)  == '2.3548200*sigma'
    height:      37152.0777 +/- 525.0717 (1.41%)  == '0.3989423*amplitude/max(1.e-15, sigma)'
[[Correlations]] (unreported correlations are <  0.100)
    C(sigma, amplitude)          =  0.579 

and give a pretty good-looking fit. For advanced practice, I recommend trying to add a ConstantModel() to give a background offset. Well, and collect more data points ;).

Upvotes: 1

M Newville
M Newville

Reputation: 7862

scipy.integrate.quad is not doing convolution, as you seem to expect. quad(function, lower_bound, upper_bound)[0] will return a single value for the integral of the function between the bounds.

OTOH, curve_fit(func, ...) needs an array of values for the model function, and it is complaining that it got a float, not an ndarray.

Maybe you intended to do curve_fit(vfunc, ...)?

You might find lmfit (https://lmfit.github.io/lmfit-py/) useful. It has convenient, high-level tools curve-fitting, and simple model functions like Gaussian are built in. It also has a mechanism for fitting a model that is the sum or product of two function, and can even create a model that consists of the convolution of two functions. For example of this, see the examples described in https://lmfit.github.io/lmfit-py/model.html#composite-models-adding-or-multiplying-models.

Upvotes: 1

Related Questions