Reputation: 32966
For a set of points, I want to get the straight line that is the closest approximation of the points using a least squares fit.
I can find a lot of overly complex solutions here on SO and elsewhere but I have not been able to find something simple. And this should be very simple.
x = np. array([1, 2, 3, 4])
y = np. array([23, 31, 42, 43 ])
slope, intercept = leastSquares(x, y)
Is there some library function that implements the above leastSquares()?
Upvotes: 1
Views: 1830
Reputation: 67
You can try with Moore-Penrose pseudo-inverse:
from scipy import linalg
x = np. array([1, 2, 3, 4])
y = np. array([23, 31, 42, 43 ])
x = np.array([x, np.ones(len(x))])
B = linalg.pinv(x)
sol = np.reshape(y,(1,len(y))) @ B
slope, intercept = sol[0,0], sol[0,1]
Upvotes: 0
Reputation: 426
numpy.linalg.lstsq
can compute such a fit for you. There is an example in the documentation that does exactly what you need.
https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html#numpy-linalg-lstsq
To summarize it here …
>>> x = np.array([0, 1, 2, 3])
>>> y = np.array([-1, 0.2, 0.9, 2.1])
>>> A = np.stack([x, np.ones(len(x))]).T
>>> m, c = np.linalg.lstsq(A, y, rcond=None)[0]
>>> m, c
(1.0 -0.95) # may vary
Upvotes: 1
Reputation: 541
Here's one way to implement the least squares regression:
import numpy as np
x = np. array([1, 2, 3, 4])
y = np. array([23, 31, 42, 43 ])
def leastSquares(x, y):
A = np.vstack([x, np.ones(len(x))]).T
y = y[:, np.newaxis]
slope, intercept = np.dot((np.dot(np.linalg.inv(np.dot(A.T,A)),A.T)),y)
return slope, intercept
slope, intercept = leastSquares(x, y)
Upvotes: 0
Reputation: 9647
Well for one, I think for an ordinary least squares fit with a single line you can derive a closed-form solution for the coefficients, if I'm not utterly mistaken. Though there's some pitfalls with numerical stability.
If you look for least squares in general, you'll find more general and thus more complex solutions, because least squares can be done for many more models than just the linear one.
But maybe the sklearn
package with its LinearRegression
model might do easily what you want to do? https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
or for more detailed control the scipy
package, https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html
import numpy as np
from scipy.linalg import lstq
# Turn x into 2d array; raise it to powers 0 (for y-axis-intercept)
# and 1 (for the slope part)
M = x[:, np.newaxis] ** [0, 1]
p, res, rnk, s = lstq(M, y)
intercept, slope = p[0], p[1]
Upvotes: 1