Reputation: 4577
I'm fitting a simple polynomial regression model, and I want to get the coefficients from the fitted model.
Given the prep code:
import pandas as pd
from itertools import product
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
# data creation
sa = [1, 0, 1, 2, 3]
sb = [2, 1, 0, 1, 2]
raw = {'a': [], 'b': [], 'w': []}
for (ai, av), (bi, bv) in product(enumerate(sa), enumerate(sb)):
    raw['a'].append(ai)
    raw['b'].append(bi)
    raw['w'].append(av + bv)
data = pd.DataFrame(raw)
# regression
x = data[['a', 'b']].values
y = data['w']
poly = PolynomialFeatures(2)
linr = LinearRegression()
model = make_pipeline(poly, linr)
model.fit(x, y)
From this answer, I know the coefficients can be obtained with
model.steps[1][1].coef_
>>> array([ 0.00000000e+00, -5.42857143e-01, -1.71428571e+00,
2.85714286e-01, 1.72774835e-16, 4.28571429e-01])
But this provides a 1-dimensional array and I'm not sure which numbers correspond to which variables.
Are they ordered as a0, a1, a2, b0, b1, b2 or as a0, b0, a1, b1, a2, b2?
Upvotes: 1
Views: 3833
Reputation: 36599
You can use the get_feature_names() method of the PolynomialFeatures step to know the order (in newer scikit-learn versions the method is called get_feature_names_out()).
In the pipeline you can do this:
model.steps[0][1].get_feature_names()
# Output:
['1', 'x0', 'x1', 'x0^2', 'x0 x1', 'x1^2']
If you have the names of the features ('a' and 'b' in your case), you can pass them to get the actual feature names.
model.steps[0][1].get_feature_names(['a', 'b'])
# Output:
['1', 'a', 'b', 'a^2', 'a b', 'b^2']
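Putting the two together, here is a small sketch that reuses the data from the question and pairs each feature name with its coefficient (the try/except covers newer scikit-learn versions, where the method was renamed to get_feature_names_out()):

```python
import pandas as pd
from itertools import product
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# rebuild the data from the question
sa = [1, 0, 1, 2, 3]
sb = [2, 1, 0, 1, 2]
raw = {'a': [], 'b': [], 'w': []}
for (ai, av), (bi, bv) in product(enumerate(sa), enumerate(sb)):
    raw['a'].append(ai)
    raw['b'].append(bi)
    raw['w'].append(av + bv)
data = pd.DataFrame(raw)

model = make_pipeline(PolynomialFeatures(2), LinearRegression())
model.fit(data[['a', 'b']].values, data['w'])

poly_step = model.steps[0][1]
try:
    names = list(poly_step.get_feature_names_out(['a', 'b']))  # scikit-learn >= 1.0
except AttributeError:
    names = poly_step.get_feature_names(['a', 'b'])  # older scikit-learn
coefs = model.steps[1][1].coef_

# names and coefficients line up index by index
for name, coef in zip(names, coefs):
    print(f'{name}: {coef:.6f}')
```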
Upvotes: 3
Reputation: 497
First, the features of a degree-2 polynomial in a and b are 1, a, b, a^2, ab, and b^2, and the coefficients come in this order in the scikit-learn implementation. You can verify this by creating a simple set of inputs, e.g.
import numpy as np
x = np.array([[2, 3], [2, 3], [2, 3]])
print(x)
[[2 3]
[2 3]
[2 3]]
And then creating the polynomial features:
poly = PolynomialFeatures(2)
x_poly = poly.fit_transform(x)
print(x_poly)
[[1. 2. 3. 4. 6. 9.]
[1. 2. 3. 4. 6. 9.]
[1. 2. 3. 4. 6. 9.]]
You can see that the first and second features are a and b (not counting the bias column 1), the third feature is a^2 (i.e. 2^2), the fourth is ab = 2*3, and the last is b^2 = 3^2, i.e. your model is:
w = c0*1 + c1*a + c2*b + c3*a^2 + c4*ab + c5*b^2
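To check that reading of the coefficient array, here is a sketch with made-up data following a known quadratic relation: it reconstructs the pipeline's predictions by hand as intercept plus the dot product of the coefficient array with the transformed feature matrix, and recovers the known coefficients in the expected positions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# made-up, noise-free data: w = 1 + 2*a + 3*b^2
rng = np.random.default_rng(0)
x = rng.random((20, 2))
y = 1 + 2 * x[:, 0] + 3 * x[:, 1] ** 2

model = make_pipeline(PolynomialFeatures(2), LinearRegression())
model.fit(x, y)

lin = model.steps[1][1]
x_poly = model.steps[0][1].transform(x)  # columns: 1, a, b, a^2, ab, b^2

# predictions are intercept_ plus coefficient * column, summed over columns
manual = lin.intercept_ + x_poly @ lin.coef_
print(np.allclose(manual, model.predict(x)))
print(np.round(lin.coef_, 6))  # 2 in the 'a' slot, 3 in the 'b^2' slot
```

Note that the coefficient for the bias column 1 comes out as (numerically) zero, because LinearRegression fits its own intercept separately; that matches the leading 0.0 in the question's coef_ output.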
Upvotes: 1