QhadriQas

Reputation: 23

Why can't SciPy's curve_fit estimate the covariance / give me sensible parameters for this higher-order Gaussian function?

Here's some minimal code:

import numpy as np
from scipy.optimize import curve_fit
xdata = [16.530468600170202, 16.86156794563677, 17.19266729110334, 17.523766636569913,
 17.854865982036483, 18.18596532750305, 18.51706467296962, 18.848164018436194, 19.179263363902763,
 19.510362709369332]
ydata = [394, 1121, 1173, 1196, 1140, 1196, 1158, 1160, 1046, 416]
# function I'm trying to fit the data to (higher-order Gaussian)
def gauss(x, sig, n, c, x_o):
    return c*np.exp(-(x-x_o)**n/(2*sig**n))
popt = curve_fit(gauss, xdata, ydata)
#attempt at printing out parameters
print(popt)

When I try to execute the code, I get this error message:

ExCurveFit:9: RuntimeWarning: invalid value encountered in power
  return c*np.exp(-(x-x_o)**n/(2*sig**n))
C:\Users\dsneh\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\scipy\optimize\minpack.py:828: OptimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated',
(array([nan, nan, nan, nan]), array([[inf, inf, inf, inf],
       [inf, inf, inf, inf],
       [inf, inf, inf, inf],
       [inf, inf, inf, inf]]))

I've read that the first warning can usually be ignored, but perhaps it is part of the problem here. The second one is more concerning, and I would obviously like more sensible values for the parameters. I've tried supplying an initial guess ([2, 3, 1000, 17], for reference), but it did not help.

Upvotes: 2

Views: 446

Answers (1)

Alexander S. Brunmayr

Reputation: 548

I believe you are running into this problem because curve_fit also tries non-integer values of n, for which your function gauss is undefined over the reals whenever x < x_o (a negative base raised to a non-integer power). NumPy returns nan in that case, which is exactly the "invalid value encountered in power" warning you are seeing, and the nan values in turn break the covariance estimate.
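A quick check of that claim (my own sketch, not code from the question): with NumPy floats, a negative base raised to an integer-valued exponent is fine, but a non-integer exponent gives nan.

```python
import numpy as np

# x < x_o in the question's data, so (x - x_o) is negative.
dx = np.float64(16.5 - 18.0)   # -1.5

print(dx ** 2.0)   # integer-valued exponent: 2.25
print(dx ** 2.5)   # non-integer exponent: nan (with a RuntimeWarning)
```

Once the model evaluates to nan, least squares cannot make progress, so the fitted parameters and covariance come back as nan/inf.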

I believe it would be easier to brute-force your way through every integer n and find the best parameters sig, c, x_o for each n. If you consider n = 0, 1, 2, ... up to perhaps 50, there are very few options for n, so brute force is actually a decent method. Looking at your data, it would be even better to consider only even n (unless your other data looks like n could be an odd integer).

Also, you should probably introduce bounds for the other parameters: sig > 0 is natural, and you can safely require c > 0 (assuming the rest of your data looks like the sample you included in your question).

Here is what I did:

import numpy as np
from scipy.optimize import curve_fit

xdata = np.asarray(xdata)  # make (x - x_o) work element-wise on the question's list

p0 = [1, 1200, 18]                             # initial guess for sig, c, x_o
bounds = ([0.01, 100, 10], [1000, 2500, 200])  # (lower, upper) bounds for sig, c, x_o
min_err = np.inf                               # error to minimize to find the best n

for n in range(0, 41, 2):  # n = 0, 2, 4, ..., 40

    def gauss(x, sig, c, x_o, n=n):  # n is fixed via the default argument
        return c*np.exp(-(x-x_o)**n/(2*sig**n))

    popt, pcov = curve_fit(gauss, xdata, ydata, p0=p0, bounds=bounds)

    err = np.sqrt(np.diag(pcov)).sum()  # sum of the parameter standard errors
    if err < min_err:
        min_err, best_n, best_popt, best_pcov = err, n, popt, pcov

print(min_err, best_n, best_popt, best_pcov, sep='\n')

import matplotlib.pyplot as plt
plt.plot(xdata, ydata)
plt.plot(xdata, gauss(xdata, *best_popt, n=best_n))
plt.show()

I get best_n = 10 and best_popt = [1.38, 1173.52, 18.02].

Here is the plot of the result (the blue line is the data, and the orange line is the gauss fit):

[plot: best fit]

Upvotes: 2
