Reputation: 110
I have a custom function that takes 6 parameters. It creates a step "curve".
I want to optimize/fit the last 5 parameters (m1, m2, FL1, FL2 and FL3).
def steps(x, FL1, m1, FL2, m2, FL3):
if x > m1:
return FL1
if x > m2:
return FL2
else:
return FL3
I want to fit this function to always be below my data points.
Example:
The real data looks slightly more complex, like this, but still not complex:
What functionality can I use here to fit this function?
I assume scipy.optimize.curve_fit
does not work due to least_squares
is used. That way it will be centered around the data points.
from scipy.optimize import curve_fit
def steps(m, FL1, m1, FL2, m2, FL3):
if m > m1:
return FL1
if mass > m2:
return FL2
else:
return FL3
x_data = [8000, 7000, 6000, 5000]
y_data = [300, 350, 400, 450]
popt, _ = curve_fit(steps, x_data, y_data)
This fails:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
It does not seem to like the if
statements in the function.
Upvotes: 1
Views: 141
Reputation: 37
I agree with Mikael's answer that you should try to make your expression as concise as possible. I have an example of your real data shape, that combines linear functions and a constant function.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# Define step function
def step_func(x, a, c, f1, f2):
result = []
for xi in x:
if xi >= 0 and xi < f1:
result.append(a * xi)
elif xi >= f1 and xi < f2:
result.append(a * f1)
elif xi >= f2:
result.append(c * xi + a * f1 - f2 * c)
else:
result.append(0)
return np.asarray(result)
# Generating some demo data
x_data = np.linspace(0, 10, 100)
y_data = step_func(x_data, 2, 1, 3, 7) + np.random.normal(0, 0.5, x_data.size) # 添加噪声
# Fitting function
def fitting_func(x, a, c, f1, f2):
return step_func(x, a, c, f1, f2)
# Fitting for data
popt, pcov = curve_fit(fitting_func, x_data, y_data, p0=[1, -1, 4, 6])
# Obtain fitting parameters
a_fit, c_fit, f1_fit, f2_fit = popt
print(f'Fitting param: a={a_fit}, c={c_fit}, f1={f1_fit}, f2={f2_fit}')
# Generat fitting results
y_fit = fitting_func(x_data, *popt)
# Plot
plt.scatter(x_data, y_data, label='Data', s=10)
plt.plot(x_data, y_fit, color='red', label='Fitted')
plt.vlines([f1_fit, f2_fit], ymin = min(x_data), ymax = max(x_data) )
# plt.xlim(0,16)
# plt.ylim(0,1)
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.title('Step func fitting')
plt.show()
The above returns:
Fitting param: a=1.9532567200579396, c=1.085588494015425, f1=3.084102835100397, f2=7.1628504872999095
Upvotes: 0
Reputation: 2375
Your optimization only has 2 parameters to optimize; m1
, m2
. The height of the step follows from your interpolated data. For example, FL1 == y_data[0]
always.
You'll need to express this in your function, and you could use it in curve_fit
. It also expects you to vectorize your function, but i just couldn't be bothered.
If we express this function
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
x_data = np.linspace(8000, 5000, 10)
y_data = np.array([400, 420, 440, 450, 453, 454, 452, 453, 453, 452])
def limit(m):
return np.interp(m, np.flip(x_data), np.flip(y_data))
def constrained_steps(m, m1, m2):
results = list() # Could be vectorized if you cared enough.
for i in range(len(m)):
if m[i] > m1:
results.append(y_data[0])
elif m[i] > m2:
results.append(limit(m1))
else:
results.append(limit(m2))
return np.array(results)
# Initialize initial guess at 2/3 and 1/3's through the x_data.
bounds = x_data[[-1,0]]
p0 = ((bounds[1]-bounds[0])*2/3 + bounds[0], (bounds[1]-bounds[0])*1/3 + bounds[0])
popt, _ = curve_fit(constrained_steps, x_data, y_data, p0=p0, bounds=bounds)
opt_m1, opt_m2 = popt
ms = np.linspace(bounds[0], bounds[1], 1000)
plt.scatter(x_data, y_data)
plt.plot(ms, constrained_steps(ms, opt_m1, opt_m2))
plt.gca().invert_xaxis()
plt.show()
Upvotes: 1