Tom Kealy
Tom Kealy

Reputation: 2669

Strange behaviour in scipy.optimize.leastsq

I've noticed some unusual behavior when trying to fit some noisy data: when I change the length of the array, I get wildly differing fits.

import numpy as np
import matplotlib.pyplot as plt
# set up true function and "measured" data
x = np.linspace(0, 6e-2, 500);
A, k, theta = 10, 1.0 / 3e-2, np.pi / 6;
y_true = A * np.sin(2 * np.pi * k * x + theta);
y_meas = y_true + 2*np.random.randn(x.size);
plt.plot(x, y_meas);
plt.plot(x, y_true);
plt.show()

Which gives this image:

Noisy Data

I've created some helper functions, and then I do a least squares fit:

# residual function, e_i
def residuals(p, y, x):
    A, k, theta = p;
    err = y - A * np.sin(2 * np.pi * k * x + theta);
    return err;

def peval(x, p):
    return p[0] * np.sin(2 * np.pi * p[1] * x + p[2]);

# starting values of A, k and theta
p0 = [12, 1 / 2.3e-2, np.pi / 3];
print(np.array(p0));

# do least squares fitting
from scipy.optimize import leastsq

plsq = leastsq(residuals, p0, args=(y_meas, x));
print(plsq[0]); print(np.array([A, k, theta]));

Plotting this gives:

plt.plot(x, peval(x, plsq[0]))
plt.plot(x, y_meas,'ro')
plt.plot(x, y_true);
plt.title('Least-squares fit to noisy data');
plt.legend(['Fit', 'Noisy', 'True']);

Least squares fit to noisy data

When I change my set up to:

x = np.linspace(0, 18e-2, 500);
A, k, theta = 10, 1.0 / 3e-2, np.pi / 6;
y_true = A * np.sin(2 * np.pi * k * x + theta);
y_meas = y_true + 2*np.random.randn(x.size);

(i.e. I triple the length of time I measure), and then run the rest of the code, the fit I get becomes:

Very bad fit

I've tried stepping through the code, but can't come up with a reason that this example is failing.

Upvotes: 1

Views: 247

Answers (1)

Oscar Gargiulo
Oscar Gargiulo

Reputation: 48

if you try with initial guess

p0=[12.0, 35.0, 0.39269908169872414]

it will work in both cases. The initial parameters that you have used are giving a residual > 60e3.

Remember that increasing the number of points will change the problem, if you check the non-linear least squares in https://en.wikipedia.org/wiki/Least_squares you can see that there is a sum over the number of points when the minimization is performed, so it is not unusual that a fit doesn't work when more points are added.

Upvotes: 1

Related Questions