Reputation: 4292
Let's say I have a model f which is parameterized by t. I want the optimal value of t such that ∑ₓ (f(x, t) - y(x))² is minimized. This is what least squares optimization is for.
Consider the following example:
from numpy import *
from scipy.optimize import curve_fit
from matplotlib.pyplot import plot  # needed for the plot() calls below

x = arange(100)
t_true = 30
# data: a smooth logistic curve centered at t_true
y = 1. / (1 + exp(-(x - t_true) / 5.))
# model to fit: a hard step at t
f = lambda x, t: [0. if xi < t else 1. for xi in x]
t_opt, t_cor = curve_fit(f, x, y, p0=(20.,))
plot(x, y)
plot(x, f(x, t_opt))
print(t_cor)
Why do I get t_opt=20 and not something close to t_opt=30? Also, why is t_cor=inf? The result I get is:
where blue is the data and green is the fitted model, but I expect something that looks like this:
I expect this because the sum of squared residuals in the second image is surely smaller than in the first, and there is obviously no local minimum where the optimization could get stuck. So why isn't it working?
Upvotes: 0
Views: 4120
Reputation: 58865
curve_fit is a wrapper around leastsq that effectively minimizes the following error function:

def error(params, x, y):
    return np.sum((func(x, params) - y)**2)
In your question curve_fit did not work because the model you are trying to fit is very different from the model you used to generate y: your f is a hard step, which is piecewise constant in t, so the sum of squared residuals has zero gradient almost everywhere. A gradient-based optimizer therefore never moves away from p0=20, and the covariance estimate t_cor comes out as inf.
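You can check this yourself. Here is a minimal sketch (sse is a hypothetical helper name, not part of scipy) that evaluates the sum of squared residuals of the step model at a few values of t:

import numpy as np

x = np.arange(100)
t_true = 30
y = 1. / (1 + np.exp(-(x - t_true) / 5.))

# the step model from the question, vectorized
def f_step(x, t):
    return np.where(x < t, 0., 1.)

# hypothetical helper: sum of squared residuals as a function of t
def sse(t):
    return np.sum((f_step(x, t) - y)**2)

for t in (19.5, 20.0, 20.5, 21.0):
    print(t, sse(t))

The objective only changes in jumps when t crosses one of the integer sample points; between sample points it is exactly flat, so the finite-difference derivative leastsq estimates at p0=20 is zero and no step is taken.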
The recommended function to fit in this case would be (with t as the unknown):

def f(x, t):
    return 1. / (1 + exp(-(x - t) / 5.))
With this fitting function, curve_fit will work, or you can use scipy.optimize.leastsq directly:
import numpy as np
from numpy import exp
from scipy.optimize import leastsq, curve_fit

x = np.arange(100)
t_true = 30

def f(x, t):
    return 1. / (1 + exp(-(x - t) / 5.))

y = f(x, t_true)

# scalar objective; leastsq squares and sums what this returns, and the
# square of a non-negative sum is minimized at the same t
def error(t, x, y):
    return np.sum((f(x, t) - y)**2)

# leastsq returns (solution, integer status flag)
t_opt, t_cor = leastsq(error, args=(x, y), x0=1.)
print('least_sq', t_opt, t_cor)

t_opt2, t_cor2 = curve_fit(f, x, y, p0=(0,))
print('curve_fit', t_opt2, t_cor2)
Which will give you:
least_sq [ 30.00000007] 2
curve_fit [ 30.] [[ 0.]]
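As a side note, leastsq is usually given the vector of residuals rather than a scalar sum of squares, since it squares and sums the returned values internally. A minimal sketch of that usage, assuming the same f, x and y as above:

def residuals(t, x, y):
    # leastsq squares and sums these internally
    return f(x, t) - y

t_opt3, ier = leastsq(residuals, args=(x, y), x0=1.)
print('least_sq (residual form)', t_opt3, ier)

Returning the full residual vector also gives leastsq a proper per-point Jacobian to work with, which is generally more robust than collapsing everything into a single scalar.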
Upvotes: 1