David
David

Reputation: 51

scipy.optimize.curvefit fails when using bounds

I'm trying to fit a set of data with a function (see the example below) using scipy.optimize.curvefit, but when I use bounds (documentation) the fit fails and I simply get the initial guess parameters as output. As soon as I substitute -np.inf ad np.inf as bounds for the second parameter (dt in the function), the fit works. What am I doing wrong?

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

#Generate data
crc=np.array([-1.4e-14, 7.3e-14, 1.9e-13, 3.9e-13, 6.e-13, 8.0e-13, 9.2e-13, 9.9e-13, 
              1.e-12, 1.e-12, 1.e-12, 1.0e-12, 1.1e-12, 1.1e-12, 1.1e-12, 1.0e-12, 1.1e-12])
time=np.array([0., 368., 648., 960., 1520.,1864., 2248., 2655., 3031., 
                3384., 3688., 4048., 4680., 5343., 6055.,  6928.,  8120.])

#Define the function for the fit
def testcurve(x, Dp, dt):
    k = -Dp*(x+dt)*2e11
    curve = 1e-12 * (1+2*(-np.exp(k) + np.exp(4*k) - np.exp(9*k) + np.exp(16*k)))
    curve[0]= 0
    return curve

#Set fit bounds 
dtmax=time[2]
param_bounds = ((-np.inf, -dtmax),(np.inf, dtmax))

#Perform fit
(par, par_cov) = opt.curve_fit(testcurve, time, crc, p0 = (5e-15, 0), bounds = param_bounds)

#Print and plot output
print(par)       
plt.plot(time, crc, 'o')
plt.plot(time, testcurve(time, par[0], par[1]), 'r-')
plt.show()

Upvotes: 2

Views: 1445

Answers (1)

zhxywzy
zhxywzy

Reputation: 53

I encountered the same behavior today in a different fitting problem. After some searching online, I found this link quite helpful: Why does scipy.optimize.curve_fit not fit to the data?

The short answer is that: using extremely small (or large) numbers in numerical fitting is not robust and scale them leads to a much better fitting.


In your case, both crc and Dp are extremely small numbers which could be scaled up. You could play with the scale factors and within certain range the fitting looks quite robust. Full example:

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

#Generate data
crc=np.array([-1.4e-14, 7.3e-14, 1.9e-13, 3.9e-13, 6.e-13, 8.0e-13, 9.2e-13, 9.9e-13, 
              1.e-12, 1.e-12, 1.e-12, 1.0e-12, 1.1e-12, 1.1e-12, 1.1e-12, 1.0e-12, 1.1e-12])
time=np.array([0., 368., 648., 960., 1520.,1864., 2248., 2655., 3031., 
                3384., 3688., 4048., 4680., 5343., 6055.,  6928.,  8120.])

# add scale factors to the data as well as the fitting parameter
scale_factor_1 = 1e12 # 1./np.mean(crc) also works if you don't want to set the scale factor manually
scale_factor_2 = 1./2e11

#Define the function for the fit
def testcurve(x, Dp, dt):
    k = -Dp*(x+dt)*2e11 * scale_factor_2
    curve = 1e-12 * (1+2*(-np.exp(k) + np.exp(4*k) - np.exp(9*k) + np.exp(16*k))) * scale_factor_1
    curve[0]= 0
    return curve

#Set fit bounds 
dtmax=time[2]
param_bounds = ((-np.inf, -dtmax),(np.inf, dtmax))

#Perform fit
(par, par_cov) = opt.curve_fit(testcurve, time, crc*scale_factor_1, p0 = (5e-15/scale_factor_2, 0), bounds = param_bounds)

#Print and plot output
print(par[0]*scale_factor_2, par[1])

plt.plot(time, crc*scale_factor_1, 'o')
plt.plot(time, testcurve(time, par[0], par[1]), 'r-')
plt.show()

Fitting results: [6.273102923176595e-15, -21.12202697564494], which gives a reasonable fitting and also is very close to the result without any bounds: [6.27312512e-15, -2.11307470e+01]

Upvotes: 1

Related Questions