Reputation: 4636
I've built a few different linear regressions, using the same group of predictor variables, as you can see below:
model = LinearRegression()
model.fit(X=predictor_train, y=target_train)
prediction_train = model.predict(predictor_train)
pred = model.predict(main_frame.iloc[-1:, 1:])
To create the predictions of the target variable, I suppose the scikit-learn algorithm fitted an equation using those predictor variables. My question is: how do I access that equation?
Upvotes: 4
Views: 2227
Reputation: 9095
from math import fabs

import pandas as pd
from sklearn.linear_model import LinearRegression


def get_regression_formula(df, independent_vars, dependent_var):
    X = df[independent_vars]
    y = df[dependent_var]
    regression = LinearRegression()
    regression.fit(X, y)
    formula = [f"{regression.intercept_:.2f} "]
    for i, var in enumerate(independent_vars):
        coef = regression.coef_[i]
        coef_abs = fabs(coef)
        if coef_abs < 0.1:  # skip terms with negligible coefficients
            continue
        formula.append(f"{'+' if coef > 0 else '-'} {coef_abs:.2f} * {var} ")
    return f"{dependent_var} = {''.join(formula)}"
Usage example:
>>> df.columns
Index(['x1', 'x2', 'x3', 'y'], dtype='object')
>>> print(get_regression_formula(df, ["x1", "x2", "x3"], "y"))
Upvotes: 1
Reputation: 1849
You're looking for params = model.coef_. This returns an array with the weight of each model input.
Note that this is a linear equation, so to get the prediction yourself you would form y = sum(input[i] * params[i] for i in range(len(params))) + model.intercept_, given some input array called input. If you're familiar with linear algebra, that sum is the dot product of the parameter vector and the feature vector, plus the intercept.
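To illustrate, here is a minimal self-contained sketch with synthetic data (the variable names and the generated coefficients are assumptions for the demo, not from the question). It fits a model, then reproduces model.predict by hand from coef_ and intercept_:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data: y = 3*x1 - 2*x2 + 5,
# so the fit should recover exactly these coefficients.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 5

model = LinearRegression()
model.fit(X, y)

params = model.coef_          # one weight per input column
intercept = model.intercept_  # constant term of the equation

# Manual prediction: dot product of weights and features, plus the intercept.
x_new = np.array([1.0, 2.0])
manual = x_new @ params + intercept

print(manual)                     # ~ 3*1 - 2*2 + 5 = 4.0
print(model.predict([x_new])[0])  # matches the manual value
```

Because the synthetic data here is exactly linear, the recovered equation matches the generating one; with real data the coefficients are a least-squares fit instead.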
Upvotes: 6