Reputation: 747
import pandas as pd
dataset = pd.read_excel('dfmodel.xlsx')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
y_pred = regressor.predict(X_test)  # predict on the held-out test set
from sklearn.metrics import r2_score
print('The R2 score of the multiple linear regression model is:', r2_score(y_test, y_pred))
With the code above, I managed to fit a multiple linear regression and compute its R2 score. How do I get the beta coefficients of each predictor variable?
Upvotes: 1
Views: 15063
Reputation: 183
Personally, I prefer the single step of np.polyfit() with degree 1 specified. Note that np.polyfit() expects a 1-D x array, so this applies when you have a single predictor:
import numpy as np
np.polyfit(X, y, 1)[0]  # for degree 1, [0] is the slope (beta); higher degrees return more coefficients
So for your question, if I'm understanding it correctly, you're looking to regress the predicted y values against the initial y values, which would be:
np.polyfit(y_test, y_pred, 1)[0]
I would test np.polyfit(X_test, y_pred, 1)[0] instead, though.
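As a minimal runnable sketch of that approach (using hypothetical toy data, since np.polyfit() only accepts a 1-D x):
import numpy as np
# toy 1-D data: y = 2x + 1 plus noise
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2 * x + 1 + rng.normal(0, 0.5, 100)
slope, intercept = np.polyfit(x, y, 1)  # degree-1 coefficients, highest power first
print(slope, intercept)  # slope close to 2, intercept close to 1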
Upvotes: 2
Reputation: 21274
Use regressor.coef_. You can see how these coefficients map onto the predictor variables, in order, by comparing against a statsmodels implementation:
from sklearn.linear_model import LinearRegression
# fit without an intercept, to match the uncentered statsmodels fit below
regressor = LinearRegression(fit_intercept=False)
regressor.fit(X, y)
regressor.coef_
# array([0.43160901, 0.42441214])
The statsmodels version:
import statsmodels.api as sm
# no constant term is added here, matching fit_intercept=False above
mod = sm.OLS(y, X)
res = mod.fit()
print(res.summary())
                                 OLS Regression Results
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.624
Model:                            OLS   Adj. R-squared (uncentered):              0.623
Method:                 Least Squares   F-statistic:                              414.0
Date:                Tue, 29 Sep 2020   Prob (F-statistic):                   1.25e-106
Time:                        17:03:27   Log-Likelihood:                         -192.54
No. Observations:                 500   AIC:                                      389.1
Df Residuals:                     498   BIC:                                      397.5
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.4316      0.041     10.484      0.000       0.351       0.512
x2             0.4244      0.041     10.407      0.000       0.344       0.505
==============================================================================
Omnibus:                       36.830   Durbin-Watson:                   1.967
Prob(Omnibus):                  0.000   Jarque-Bera (JB):               13.197
Skew:                           0.059   Prob(JB):                      0.00136
Kurtosis:                       2.213   Cond. No.                         2.57
==============================================================================
Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
You can do a direct equivalency test with:
np.array([regressor.coef_.round(8) == res.params.round(8)]).all() # True
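Since exact elementwise comparison of floats is fragile, np.allclose() does the same check with a tolerance:
np.allclose(regressor.coef_, res.params)  # True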
Upvotes: 0
Reputation: 434
Per the sklearn.linear_model.LinearRegression documentation page, you can find the coefficients (slopes) and the intercept at regressor.coef_ and regressor.intercept_, respectively.
If you apply sklearn.preprocessing.StandardScaler to your features before fitting the model, then the regression coefficients should be the standardized beta coefficients you're looking for.
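A minimal sketch of that approach, assuming the same X_train and y_train as in the question:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
# scale features to zero mean and unit variance, then fit the regression
pipe = make_pipeline(StandardScaler(), LinearRegression())
pipe.fit(X_train, y_train)
# coefficients on the standardized scale, one per predictor in column order
print(pipe.named_steps['linearregression'].coef_)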
Upvotes: 1