Reputation: 21622
I use this code for smoothing data by fitting exponent with scipy.optimize.curve_fit
:
def smooth_data_v1(x_arr,y_arr):
def func(x, a, b, c):
return a*np.exp(-b*x)+c
#Scale data
y = y_orig / 10000.0
x = 500.0 * x_orig
popt, pcov = curve_fit(func, x, y, p0=(1, 0.01, 1))
y_smooth = func(x, *popt) # Calcaulate smoothed values for same points
#Undo scaling
y_final = y_smooth * 10000.0
return y_final
Howewer I want estimated exponent curve to go through 1st point.
Bad case:
Good case:
I have tried to remove last parameter using first point x0,y0:
def smooth_data_v2(x_orig,y_orig):
x0 = x_orig[0]
y0 = y_orig[0]
def func(x, a, b):
return a*np.exp(-b*x)+y0-a*np.exp(-b*x0)
#Scale data
y = y_orig / 10000.0
x = 500.0 * x_orig
popt, pcov = curve_fit(func, x, y, p0=(1, 0.01))
y_smooth = func(x, *popt) # Calcaulate smoothed values for same points
#Undo scaling
y_final = y_smooth * 10000.0
return y_final
Howewer something go wrong and I get:
a
param is really large
popt [ 4.45028144e+05 2.74698863e+01]
Any ideas?
Update:
Example of data
x_orig [ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.]
y_orig [ 445057. 447635. 450213. 425089. 391746. 350725. 285433. 269027.
243835. 230587. 216757. 202927. 189097. 175267. 161437.]
Upvotes: 3
Views: 4710
Reputation: 12410
Scipy curve_fit
allows for passing the parameter sigma
, which is designed to be the standard deviation for weighting the fit. But this array can be filled with arbitrary data:
from scipy.optimize import curve_fit
def smooth_data_v1(x_arr,y_arr):
def func(x, a, b, c):
return a*np.exp(-b*x)+c
#create the weighting array
y_weight = np.empty(len(y_arr))
#high pseudo-sd values, meaning less weighting in the fit
y_weight.fill(10)
#low values for point 0 and the last points, meaning more weighting during the fit procedure
y_weight[0] = y_weight[-5:-1] = 0.1
popt, pcov = curve_fit(func, x_arr, y_arr, p0=(y_arr[0], 1, 1), sigma = y_weight, absolute_sigma = True)
print("a, b, c:", *popt)
y_smooth = func(x_arr, *popt)
return y_smooth
x_orig = np.asarray([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
y_orig = np.asarray([ 445057, 447635, 450213, 425089, 391746, 350725, 285433, 269027,
243835, 230587, 216757, 202927, 189097, 175267, 161437])
print(smooth_data_v1(x_orig, y_orig))
As you can see, now the first and last point are close to the original values, but this "clamping of values" comes sometimes at a price for the rest of the data points.
You have probably also noticed, that I removed your rescaling part. Imho, one shouldn't do this before curve fitting procedures. It is usually better to use raw data. Additionally, your data are not really well represented by an exponential function, hence the tiny b
value.
Upvotes: 2
Reputation: 10759
How about changing the definition of the function you try to fit?
def func(x, a, b, c):
return (y_arr[0] - c)*np.exp(-b*(x - x_arr[0]))+c
This function always fits the first point perfectly by definition, but otherwise can do anything your current function does.
Upvotes: 0