Getting a series of Normal Distributions from a Least Squares regression

Question

I am not particularly good at math. I would like to get some breadcrumbs about how to solve the following formula using python code.

Assume an [m,n] matrix M and a [1,n] vector y.
Solve for the least squares using scipy.linalg.lstsq(M, y).
The output will be an [m,1] vector of coefficients a in the equation Ma=y.

As per this question, any vector of solutions like a in a regression is basically a series of single points each taken from a normal distribution that describes the error of every point on the regression. In effect, every single digit in the solution vector a is the mean of a normal distribution of errors centred on zero.

I would like to find those normal distributions rather than the scalar value for every single point in the solution. Apologies for the poor description of the mathy bits, I was never trained in math in Uni.

Kapocsi · Accepted Answer

Here is a hint. Let me know if you want more.

scipy.linalg.lstsq(M, y) returns four things:

x : (N,) or (N, K) ndarray
  Least-squares solution.

residues : (K,) ndarray or float
  Square of the 2-norm for each column in b - a x, if M > N and ndim(A) == n
  (returns a scalar if b is 1-D). Otherwise a (0,)-shaped array is returned.

rank : int
  Effective rank of a.

s : (min(M, N),) ndarray or None
  Singular values of a. The condition number of a is s[0] / s[-1].

residues is going to be of interest to you!

https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html

Getting a series of Normal Distributions from a Least Squares regression

Answers (2)

Related Questions