Reputation: 781
I have a sklearn pipeline with PolynomialFeatures() and LinearRegression() in series. My aim is to fit data to this using different degrees of the polynomial features and measure the score. The following is the code I use:
steps = [('polynomials',preprocessing.PolynomialFeatures()),('linreg',linear_model.LinearRegression())]
pipeline = pipeline.Pipeline(steps=steps)
scores = dict()
for i in range(2,6):
    params = {'polynomials__degree': i,'polynomials__include_bias': False}
    #pipeline.set_params(**params)
    pipeline.fit(X_train,y=yCO_logTrain,**params)
    scores[i] = pipeline.score(X_train,yCO_logTrain)
scores
I receive the error: TypeError: fit() got an unexpected keyword argument 'degree'.
Why is this error thrown even though the parameters are named in the format <estimator_name>__<parameter_name>?
Upvotes: 1
Views: 2217
Reputation: 4275
As per the sklearn.pipeline.Pipeline documentation:
**fit_params : dict of string -> object
Parameters passed to the fit method of each step, where each parameter name is prefixed such that parameter p for step s has key s__p.
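This means that parameters passed this way are forwarded directly to the .fit() method of step s. If you check the PolynomialFeatures documentation, the degree argument is used in the construction of the PolynomialFeatures object, not in its .fit() method, so PolynomialFeatures.fit() rejects it as an unexpected keyword. Constructor arguments like this are changed with pipeline.set_params(**params) (the call commented out in your loop) before calling fit() without them.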
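To make the distinction concrete, here is a minimal sketch of the loop from the question using set_params for the constructor arguments. The synthetic data and the names model, y_train and weights are assumptions for illustration (the original X_train and yCO_logTrain are not shown), and the pipeline variable is called model so that it does not rebind a module name. The last lines show what a genuine fit-time parameter looks like, using sample_weight, which LinearRegression.fit() accepts.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumption): a noisy cubic in one feature.
rng = np.random.default_rng(0)
X_train = rng.uniform(-2, 2, size=(200, 1))
y_train = X_train[:, 0] ** 3 - X_train[:, 0] + rng.normal(scale=0.1, size=200)

model = Pipeline(steps=[
    ("polynomials", PolynomialFeatures()),
    ("linreg", LinearRegression()),
])

scores = {}
for degree in range(2, 6):
    # Constructor arguments are changed with set_params, not passed to fit().
    model.set_params(polynomials__degree=degree, polynomials__include_bias=False)
    model.fit(X_train, y_train)
    scores[degree] = model.score(X_train, y_train)

# Genuine fit-time arguments do go through fit() with the same step prefix,
# for example sample_weight for LinearRegression.fit().
weights = np.ones(len(y_train))
model.fit(X_train, y_train, linreg__sample_weight=weights)

print(scores)
```

Note that these are training-set scores, as in the question, so they will not decrease as the degree grows; a held-out set or cross-validation gives a fairer comparison between degrees.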
If you want to try different hyperparameters for estimators/transformers within a pipeline, you could use GridSearchCV as shown here. Here's example code from the link:
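from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest
# calibrated_forest is assumed to be defined earlier (a calibrated classifier
# wrapping a forest); X and y are your data.
pipe = Pipeline([
    ('select', SelectKBest()),
    ('model', calibrated_forest)])
# Keys use <step_name>__<parameter_name>; nested estimators add another '__'.
# Note: recent scikit-learn releases name the wrapped estimator 'estimator'
# rather than 'base_estimator', so the key may need adjusting.
param_grid = {
    'select__k': [1, 2],
    'model__base_estimator__max_depth': [2, 4, 6, 8]}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)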
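Applied to the pipeline from the question, the same idea might look like the sketch below. The data here is synthetic and only stands in for the original X_train and yCO_logTrain, which are not shown; cv=5 is carried over from the example above rather than being a recommendation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumption): a noisy cubic in one feature.
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(200, 1))
y = X[:, 0] ** 3 - X[:, 0] + rng.normal(scale=0.1, size=200)

pipe = Pipeline([
    ("polynomials", PolynomialFeatures(include_bias=False)),
    ("linreg", LinearRegression()),
])

# The grid key addresses the constructor argument 'degree' of the
# 'polynomials' step; GridSearchCV applies it via set_params internally.
param_grid = {"polynomials__degree": [2, 3, 4, 5]}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)

print(search.best_params_)
print(search.cv_results_["mean_test_score"])
```

Unlike the manual loop, the scores here are cross-validated R² values rather than training scores, so a higher degree is not automatically rewarded.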
Upvotes: 3