aabujamra

Reputation: 4636

Python/Scikit-learn - Linear Regression - Access to Linear Regression Equation

I've built a few different linear regressions, using the same group of predictor variables, as you can see below:

model=LinearRegression()
model.fit(X=predictor_train,y=target_train)
prediction_train=model.predict(predictor_train)
pred=model.predict(main_frame.iloc[-1:,1:])

To predict the target variable, I assume scikit-learn fitted an equation from those predictor variables. My question is: how do I access that equation?

Upvotes: 4

Views: 2227

Answers (2)

Mugen

Reputation: 9095

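from math import fabs
import pandas as pd
from sklearn.linear_model import LinearRegression

def get_regression_formula(df, independent_vars, dependent_var):
    """Fit a linear regression on df and return its equation as a string."""
    X = df[independent_vars]
    y = df[dependent_var]
    regression = LinearRegression()
    regression.fit(X, y)
    # Start with the constant term (intercept_), then append one term per predictor.
    formula = [f"{regression.intercept_:.2f} "]
    for i, var in enumerate(independent_vars):
        # coef_[i] is the fitted weight of the i-th column of X.
        coef = regression.coef_[i]
        coef_abs = fabs(coef)
        # Terms with |coefficient| < 0.1 are left out to keep the string short,
        # so the printed formula can be an approximation of the fitted model.
        if coef_abs < 0.1:
            continue
        formula.append(f"{'+' if coef > 0 else '-'} {coef_abs:.2f} * {var} ")
    return f"{dependent_var} = {''.join(formula)}"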

Usage example:

>>> df.columns
Index(['x1', 'x2', 'x3', 'y'], dtype='object')
>>> print(get_regression_formula(df, ["x1", "x2", "x3"], "y"))

Upvotes: 1

Alex Alifimoff

Reputation: 1849

You're looking for params = model.coef_ together with model.intercept_. coef_ is an array with the weight of each model input, and intercept_ is the constant term.

Note that this is a linear equation, so to compute the prediction yourself, form y = model.intercept_ + sum(x[i] * params[i] for i in range(len(params))), where x is one row of input features. The sum is the dot product of the parameter vector and the feature vector, if you're familiar with linear algebra.
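
As a quick check, here is a minimal sketch that rebuilds the predictions by hand from coef_ and intercept_ and compares them with model.predict. The data is synthetic and made up purely for illustration. LinearRegression fits an intercept by default, so the constant term has to be added to the dot product.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data (an assumption for illustration):
# y = 1.5 + 2.0 * x1 - 3.0 * x2
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 1.5 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

model = LinearRegression().fit(X, y)

# Rebuild the predictions by hand: intercept plus the dot product
# of each feature row with the fitted weights.
manual = model.intercept_ + X @ model.coef_

print("intercept:", model.intercept_)
print("coefficients:", model.coef_)
print("matches model.predict:", np.allclose(manual, model.predict(X)))
```

If the model was created with fit_intercept=False, intercept_ is 0.0 and the dot product alone gives the prediction.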

Upvotes: 6
