twistfire
twistfire

Reputation: 476

using scipy.optimize curve_fit to find parameters of a curve and getting 'Covariance of the parameters could not be estimated'

I am trying to use scipy.optimize to fit experimental data and got:

optimizeWarning: Covariance of the parameters could not be estimated
  warnings.warn('Covariance of the parameters could not be estimated',

Here is the data I am trying to fit with exponential curve:

enter image description here

here is the part of a code where I am trying to fit a data:

# using curve_fit
from scipy.optimize import curve_fit

# defining a function

# exponential curve
def _1_func(x,  a0,b0,beta):
    """
    calculates the exponential curve shifted by bo and scaled by a0
    beta is exponential 
    """
    y = a0 * np.exp( beta * x ) + b0
    return y

# the code to fit
# initial guess for exp fitting params

numpoints = spectrum_one.shape[0]

x = F[1:numpoints] # zero element is not used
y = np.absolute(spectrum_one[1:numpoints])/signal_size

# making an initial guess
a0 = 1
b0 = y.mean()
beta = -100  

p0 = [a0, b0, beta]

popt, pcov = curve_fit(_1_func, x, y, p0=p0)
perr = np.sqrt(np.diag(pcov)) # errors

print('Popt')
print(popt)
print('Pcov')
print(pcov)

UPDATE1: The result is:

Popt
[ 1.00000000e+00  7.80761109e-04 -1.00000000e+02]
Pcov
[[inf inf inf]
 [inf inf inf]
 [inf inf inf]]

UPDATE 2 - raw data for fitting is here in csv format: https://drive.google.com/file/d/1wUoS3Dq_3XdZwo3OMth4_PT-1xVJXdfy/view?usp=share_link

As I understand ic pcov has inf - it shows that curve_fit can't calculate the covariance and the popt parameters can't be used they are not optimal for this data..

If I visualize the data I have next results:

enter image description here

Why am I getting this type of error? (I thought it is an easy task for curve_fit)

Maybe I need to scale my data somehow?

Upvotes: 0

Views: 1563

Answers (1)

adrianop01
adrianop01

Reputation: 600

Solution

Scaling up x before fitting the data to the exponential function solves this problem, or scaling down the starting beta value (refer to explanations for discussion).

# using curve_fit
from scipy.optimize import curve_fit
import numpy as np

spectrum_one = np.genfromtxt('spectrum_one.csv', delimiter=',', skip_header = 1)

# defining a function
# you may also just change the starting beta value and get the same effects.
# x_scale = 1
x_scale = 10000000
y_scale = 1

# exponential curve
def _1_func(x,  a0,b0,beta):
    """
    calculates the exponential curve shifted by bo and scaled by a0
    beta is exponential 
    """
    y = (a0 * np.exp( beta * (x/x_scale) ) + b0)*y_scale
    return y

# the code to fit
# initial guess for exp fitting params

x = spectrum_one[:,0] # zero element is not used
y = np.absolute(spectrum_one[:,1])

# making an initial guess
a0 = 1
b0 = y.mean()
beta = -100 
# Scaling beta have same effect:
# beta = -100/10000000 

p0 = [a0, b0, beta]

popt, pcov = curve_fit(_1_func, x, y, p0=p0)
perr = np.sqrt(np.diag(pcov)) # errors

print('Popt')
print(popt)
print('Pcov')
print(pcov)

Output of the covariance matrix and estimated coefficients are as follows:

Popt
[ 2.22681758e-03  6.16059861e-04 -2.69660695e+01]
Pcov
[[ 4.41126691e-09  1.26838432e-12 -5.34773724e-05]
 [ 1.26838432e-12  1.15025798e-10 -5.58666885e-06]
 [-5.34773724e-05 -5.58666885e-06  1.55785623e+00]]

Note that the above function can be directly used for curve fitting of original, unscaled data (x is unmodified frequencies):

plt.plot(x,y)
plt.plot(x,_1_func(x,*popt))

enter image description here

Explanation

The reason why the least square algorithm (Levenberg–Marquardt algorithm) fails to converge under the original starting condition is likely because of the Jacobian J at the starting point being essentially a zero vector numerically:

enter image description here If we calculate the Jacobian with respect to coefficient vector β:

enter image description here

If x is in the original scale with your original beta=-100, the exponential components in the Jacobian will be very close to zero numerically. Thus, either the LM algorithm does not give a sensible value of coefficients increments δ = Δβ due to numerical stability issues (dividing by very small floating point values), or simply fails to converge because the Jacobian is essentially zero (refer to the above equation).

In fact, if you try a very small beta = -100/10000000 at the beginning (such that the numerical value of exponential is not close to zero), without the data scaling AT ALL x_scale = y_scale = 1, the algorithm still converges:

Popt
[ 2.22681758e-03  6.16059861e-04 -2.69660695e-06]
Pcov
[[ 4.41126692e-09  1.26838469e-12 -5.34773723e-12]
 [ 1.26838469e-12  1.15025798e-10 -5.58666883e-13]
 [-5.34773723e-12 -5.58666883e-13  1.55785621e-14]]

enter image description here

You may want to refer to this answer for further discussions: scipy curve_fit raises "OptimizeWarning: Covariance of the parameters could not be estimated"

Upvotes: 5

Related Questions