Reputation: 51
I'm trying to fit a set of data with a function (see the example below) using scipy.optimize.curvefit
,
but when I use bounds (documentation) the fit fails and I simply get
the initial guess parameters as output.
As soon as I substitute -np.inf
ad np.inf
as bounds for the second parameter
(dt
in the function), the fit works.
What am I doing wrong?
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt
#Generate data
crc=np.array([-1.4e-14, 7.3e-14, 1.9e-13, 3.9e-13, 6.e-13, 8.0e-13, 9.2e-13, 9.9e-13,
1.e-12, 1.e-12, 1.e-12, 1.0e-12, 1.1e-12, 1.1e-12, 1.1e-12, 1.0e-12, 1.1e-12])
time=np.array([0., 368., 648., 960., 1520.,1864., 2248., 2655., 3031.,
3384., 3688., 4048., 4680., 5343., 6055., 6928., 8120.])
#Define the function for the fit
def testcurve(x, Dp, dt):
k = -Dp*(x+dt)*2e11
curve = 1e-12 * (1+2*(-np.exp(k) + np.exp(4*k) - np.exp(9*k) + np.exp(16*k)))
curve[0]= 0
return curve
#Set fit bounds
dtmax=time[2]
param_bounds = ((-np.inf, -dtmax),(np.inf, dtmax))
#Perform fit
(par, par_cov) = opt.curve_fit(testcurve, time, crc, p0 = (5e-15, 0), bounds = param_bounds)
#Print and plot output
print(par)
plt.plot(time, crc, 'o')
plt.plot(time, testcurve(time, par[0], par[1]), 'r-')
plt.show()
Upvotes: 2
Views: 1445
Reputation: 53
I encountered the same behavior today in a different fitting problem. After some searching online, I found this link quite helpful: Why does scipy.optimize.curve_fit not fit to the data?
The short answer is that: using extremely small (or large) numbers in numerical fitting is not robust and scale them leads to a much better fitting.
In your case, both crc
and Dp
are extremely small numbers which could be scaled up. You could play with the scale factors and within certain range the fitting looks quite robust. Full example:
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt
#Generate data
crc=np.array([-1.4e-14, 7.3e-14, 1.9e-13, 3.9e-13, 6.e-13, 8.0e-13, 9.2e-13, 9.9e-13,
1.e-12, 1.e-12, 1.e-12, 1.0e-12, 1.1e-12, 1.1e-12, 1.1e-12, 1.0e-12, 1.1e-12])
time=np.array([0., 368., 648., 960., 1520.,1864., 2248., 2655., 3031.,
3384., 3688., 4048., 4680., 5343., 6055., 6928., 8120.])
# add scale factors to the data as well as the fitting parameter
scale_factor_1 = 1e12 # 1./np.mean(crc) also works if you don't want to set the scale factor manually
scale_factor_2 = 1./2e11
#Define the function for the fit
def testcurve(x, Dp, dt):
k = -Dp*(x+dt)*2e11 * scale_factor_2
curve = 1e-12 * (1+2*(-np.exp(k) + np.exp(4*k) - np.exp(9*k) + np.exp(16*k))) * scale_factor_1
curve[0]= 0
return curve
#Set fit bounds
dtmax=time[2]
param_bounds = ((-np.inf, -dtmax),(np.inf, dtmax))
#Perform fit
(par, par_cov) = opt.curve_fit(testcurve, time, crc*scale_factor_1, p0 = (5e-15/scale_factor_2, 0), bounds = param_bounds)
#Print and plot output
print(par[0]*scale_factor_2, par[1])
plt.plot(time, crc*scale_factor_1, 'o')
plt.plot(time, testcurve(time, par[0], par[1]), 'r-')
plt.show()
Fitting results: [6.273102923176595e-15, -21.12202697564494]
, which gives a reasonable fitting and also is very close to the result without any bounds: [6.27312512e-15, -2.11307470e+01]
Upvotes: 1