Dimitris

Reputation: 11

In sklearn, how can I get which coefficient corresponds to which parameter in a polynomial linear regression?

I am doing a linear regression with scikit-learn in Python 3. I have arrays of x and y data and want to fit a linear regression using 3rd-degree polynomial features (and then apply the fitted curve to my data). After that, I want to figure out what the actual equation of this polynomial is. However, I do not know the order of the results when I use the model.coef_ attribute.

By the way, I have only one independent variable x. Let's suppose the equation I need is of the form y = a*x + b*x^2 + c*x^3 + intercept. I tried the model.coef_ attribute, but I am not sure what the order of the printed result is.

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import LinearRegression

    # The data
    # ----------------------
    utility = np.array([100, 96.64, 43.94, 24.48, 0, 0.05])
    windiness = np.array([0, 2.5, 6.7, 12.3, 15.5, 19, 20])
    windiness = windiness[:, np.newaxis]
    utility = utility[:, np.newaxis]

    # Regression
    # -----------------------
    polynomial_features = PolynomialFeatures(degree=3)
    x_poly = polynomial_features.fit_transform(windiness)
    model = LinearRegression()
    model.fit(x_poly, utility)
    y_poly_pred = model.predict(x_poly)

So running print(model.coef_) outputs

[[ 0.        , -6.78066221, -0.19310896,  0.01361347]]

But which number is a, which is b, and so on?

Upvotes: 1

Views: 2230

Answers (1)

Rheatey Bash

Reputation: 847

First of all, your windiness array contains one extra value, so you need to delete one so that the input and output arrays have the same length. I have removed the leading 0 for simplicity. Now, let's move ahead with your updated code.

utility = np.array([100, 96.64, 43.94, 24.48, 0, 0.05])
windiness = np.array([2.5, 6.7, 12.3, 15.5, 19, 20])
windiness = windiness[:, np.newaxis]
utility = utility[:, np.newaxis]

polynomial_features= PolynomialFeatures(degree=3)
x_poly = polynomial_features.fit_transform(windiness)
model = LinearRegression()
model.fit(x_poly, utility)
y_poly_pred = model.predict(x_poly)

Now, let's print your new transformed feature matrix:

print(x_poly)

You should get output similar to this:

[[1.000000e+00 2.500000e+00 6.250000e+00 1.562500e+01]
 [1.000000e+00 6.700000e+00 4.489000e+01 3.007630e+02]
 [1.000000e+00 1.230000e+01 1.512900e+02 1.860867e+03]
 [1.000000e+00 1.550000e+01 2.402500e+02 3.723875e+03]
 [1.000000e+00 1.900000e+01 3.610000e+02 6.859000e+03]
 [1.000000e+00 2.000000e+01 4.000000e+02 8.000000e+03]]

Here we can see that the first column is X^0, the second is X^1, the third is X^2, and the fourth is X^3. The polynomial regression has been converted into an equivalent linear model on these features.

Your model's coefficients can be seen with print(model.coef_), which gives [[ 0.0 11.125 -1.718 0.047]].

Now, let's write out the fitted 3rd-order polynomial: y = model.intercept_ + 0.0 * x^0 + 11.125 * x^1 + (-1.718) * x^2 + 0.047 * x^3

Long story short, the coefficients are a = 11.125, b = -1.718, c = 0.047.
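To double-check this mapping, you can rebuild the polynomial by hand from intercept_ and coef_ and compare it against model.predict. A sketch using the six-point data above:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

utility = np.array([100, 96.64, 43.94, 24.48, 0, 0.05])[:, np.newaxis]
windiness = np.array([2.5, 6.7, 12.3, 15.5, 19, 20])[:, np.newaxis]

x_poly = PolynomialFeatures(degree=3).fit_transform(windiness)
model = LinearRegression().fit(x_poly, utility)

b0 = model.intercept_[0]
_, a, b, c = model.coef_[0]  # coef_[0][0] is the unused x^0 term

# Reconstruct y = intercept + a*x + b*x^2 + c*x^3 by hand
x = windiness.ravel()
manual = b0 + a * x + b * x**2 + c * x**3

# The hand-built polynomial matches the model's own predictions
assert np.allclose(manual, model.predict(x_poly).ravel())
```

Because x_poly's columns are exactly 1, x, x^2, x^3, the reconstruction agrees with model.predict, confirming the coefficient order.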

Upvotes: 1
