kostrykin
kostrykin

Reputation: 4292

Python: curve_fit for least squares minimization

Lets say I have a model f which is parametrized by t. I want the optimal value for t such that ∑ₓ (f(x, t) - y(x))² is minimized. This is what least squares optimization is for.

In the following example

from numpy import *
from scipy.optimize import curve_fit

x = arange(100)

t_true = 30
y = 1. / (1 + exp(-(x - t_true) / 5.))

f = lambda x, t: [0. if xi < t else 1. for xi in x]
t_opt, t_cor = curve_fit(f, x, y, p0=(20.))

plot(x, y)
plot(x, f(x, t_opt))

print(t_cor)

why do I get t_opt=20 and not something close to t_opt=30?

Also, why is t_cor=inf? The result I get is:

enter image description here

where blue is the data and green is the fitted model, but I expect something that looks like this:

enter image description here

I do expect this because the sum of squared residuals from the second image is sure smaller than it is for the first image and there obviously is no local minimum where the optimization could possibly stuck. So why isn't that working?

Upvotes: 0

Views: 4120

Answers (1)

Saullo G. P. Castro
Saullo G. P. Castro

Reputation: 58865

curve_fit is a wrapper around least_sq that uses the following error function:

 def error(params, x, y):
    return np.sum((func(x, params) - y)**2)

In your question curve_fit did not work because the equation you are trying to fit is very different from the equation you used to generate y.

The recommended function to fit in this case would be (with t as unknown):

def f(x, t):
    return 1. / (1 + exp(-(x - t) / 5.))

With this recommended fitting function, curve_fit will work, or you can use scipy.optimize.leastsq directly like:

import numpy as np
from numpy import exp
from scipy.optimize import leastsq, curve_fit

x = np.arange(100)

t_true = 30

def f(x, t):
    return 1. / (1 + exp(-(x - t) / 5.))

y = f(x, t_true)

def error(t, x, y):
    return np.sum((f(x, t) - y)**2)

t_opt, t_cor = leastsq(error, args=(x, y), x0=1.)
print('least_sq', t_opt, t_cor)

t_opt2, t_cor2 = curve_fit(f, x, y, p0=(0,))
print('curve_fit', t_opt2, t_cor2)

Which will give you:

least_sq [ 30.00000007] 2
curve_fit [ 30.] [[ 0.]]

Upvotes: 1

Related Questions