I am trying to calculate the coefficients of a multivariate linear regression using the `statsmodels` library. The problem is that with the code below I get the error

    ValueError: endog and exog matrices are different sizes

I get this error because, with this example, the `y` set has 4 elements, while the `X` matrix built inside `reg_m` ends up holding 7 ndarrays of 5 elements each. But what I don't understand is that the `x` set (not `X`) is a list with 4 lists inside (and `y` has 4 elements), where each inner list is composed of 7 variables. To me, `x` and `y` have the same number of elements.

How can I fix this error?
import numpy as np
import statsmodels.api as sm

def test_linear_regression():
    x = [[0.0, 1102249463.0, 44055788.0, 9.0, 2.0, 32000.0, 49222464.0],
         [0.0, 1102259506.0, 44049537.0, 9.0, 2.0, 32000.0, 49222464.0],
         [0.0, 1102249463.0, 44055788.0, 9.0, 2.0, 32000.0, 49222464.0],
         [0.0, 1102259506.0, 44049537.0, 10.0, 2.0, 32000.0, 49222464.0]]
    y = [71.7554421425, 37.5205008984, 44.9945571423, 53.5441429615]
    reg_m(y, x)

def reg_m(y, x):
    ones = np.ones(len(x[0]))
    X = sm.add_constant(np.column_stack((x[0], ones)))
    y.append(1)
    for ele in x[1:]:
        X = sm.add_constant(np.column_stack((ele, X)))
    results = sm.OLS(y, X).fit()
    return results

if __name__ == "__main__":
    test_linear_regression()
Assuming each list in `x` corresponds to each value of `y`:
x = [[0.0, 1102249463.0, 44055788.0, 9.0, 2.0, 32000.0, 49222464.0],
[0.0, 1102259506.0, 44049537.0, 9.0, 2.0, 32000.0, 49222464.0],
[0.0, 1102249463.0, 44055788.0, 9.0, 2.0, 32000.0, 49222464.0],
[0.0, 1102259506.0, 44049537.0, 10.0, 2.0, 32000.0, 49222464.0]
]
y = [71.7554421425, 37.5205008984, 44.9945571423, 53.5441429615]
def reg_m(x, y):
    x = np.array(x)
    y = np.array(y)
    # add a constant column of ones for the y intercept
    X = np.insert(x, 0, np.ones((1,)), axis=1)
    # or, if you REALLY want to use add_constant to add the ones, use this:
    # X = sm.add_constant(x, has_constant='add')
    return sm.OLS(y, X).fit()

model = reg_m(x, y)
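To see why the original version failed while this one works, a quick shape check helps (a sketch reusing the same data; the variable names here are just for illustration):

    import numpy as np

    x = [[0.0, 1102249463.0, 44055788.0, 9.0, 2.0, 32000.0, 49222464.0],
         [0.0, 1102259506.0, 44049537.0, 9.0, 2.0, 32000.0, 49222464.0],
         [0.0, 1102249463.0, 44055788.0, 9.0, 2.0, 32000.0, 49222464.0],
         [0.0, 1102259506.0, 44049537.0, 10.0, 2.0, 32000.0, 49222464.0]]
    y = [71.7554421425, 37.5205008984, 44.9945571423, 53.5441429615]

    # Corrected design matrix: one row per observation, constant column first
    X = np.insert(np.array(x), 0, np.ones((1,)), axis=1)
    print(X.shape)      # (4, 8) -- 4 rows, matching len(y) == 4

    # The original code stacked x[0] (a single observation of 7 features)
    # as if it were a column of 7 observations:
    ones = np.ones(len(x[0]))
    X_bad = np.column_stack((x[0], ones))
    print(X_bad.shape)  # (7, 2) -- 7 "observations" vs. 4 responses -> ValueError

`sm.OLS(endog, exog)` requires the number of rows of `exog` to equal the length of `endog`, which is exactly what the error message is complaining about.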
To see a summary printout of the model, just call `model.summary()`:
"""
OLS Regression Results
==============================================================================
Dep. Variable: y R-squared: 0.450
Model: OLS Adj. R-squared: -0.649
Method: Least Squares F-statistic: 0.4096
Date: Thu, 07 Jul 2016 Prob (F-statistic): 0.741
Time: 21:50:12 Log-Likelihood: -14.665
No. Observations: 4 AIC: 35.33
Df Residuals: 1 BIC: 33.49
Df Model: 2
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [95.0% Conf. Int.]
------------------------------------------------------------------------------
const -1.306e-07 2.18e-07 -0.599 0.657 -2.9e-06 2.64e-06
x1 -3.086e-11 5.15e-11 -0.599 0.657 -6.86e-10 6.24e-10
x2 -0.0001 0.000 -0.900 0.534 -0.002 0.002
x3 0.0031 0.003 0.900 0.534 -0.041 0.047
x4 16.0236 26.761 0.599 0.657 -324.006 356.053
x5 8.321e-12 9.25e-12 0.900 0.534 -1.09e-10 1.26e-10
x6 1.331e-07 1.48e-07 0.900 0.534 -1.75e-06 2.01e-06
x7 0.0002 0.000 0.900 0.534 -0.003 0.003
==============================================================================
Omnibus: nan Durbin-Watson: 1.500
Prob(Omnibus): nan Jarque-Bera (JB): 0.167
Skew: -0.000 Prob(JB): 0.920
Kurtosis: 2.000 Cond. No. inf
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The input rank is higher than the number of observations.
[3] The smallest eigenvalue is 0. This might indicate that there are
strong multicollinearity problems or that the design matrix is singular.
"""