Reputation: 437
I have the following pipeline:
from sklearn.pipeline import Pipeline
import lightgbm as lgb
steps_lgb = [('lgb', lgb.LGBMClassifier())]
# Create the pipeline: composed of preprocessing steps and estimators
pipe = Pipeline(steps_lgb)
Now I want to set the parameters of the classifier using the following command:
best_params = {'boosting_type': 'dart',
'colsample_bytree': 0.7332216010898506,
'feature_fraction': 0.922329814019706,
'learning_rate': 0.046566283755421566,
'max_depth': 7,
'metric': 'auc',
'min_data_in_leaf': 210,
'num_leaves': 61,
'objective': 'binary',
'reg_lambda': 0.5185517505019249,
'subsample': 0.5026815575448366}
pipe.set_params(**best_params)
This however raises an error:
ValueError: Invalid parameter boosting_type for estimator Pipeline(steps=[('estimator', LGBMClassifier())]). Check the list of available parameters with `estimator.get_params().keys()`.
boosting_type is definitely a core parameter of the LightGBM framework; if I remove it from best_params, however, the other parameters cause the same ValueError to be raised.
So, what I want is to set the parameters of the classifier after a pipeline is created.
Upvotes: 1
Views: 3438
Reputation: 60318
When using pipelines, you need to prefix each parameter with the name of the pipeline component it refers to (here lgb) followed by a double underscore (lgb__); the fact that your pipeline consists of only a single element does not change this requirement.
So, your parameters should look like this (only the first 2 elements shown):
best_params = {'lgb__boosting_type': 'dart',
'lgb__colsample_bytree': 0.7332216010898506
}
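If you would rather not retype every key, the prefixed dictionary can be built from the original one. The following is only a minimal sketch: the helper name prefix_params is hypothetical (it is not part of scikit-learn or LightGBM), and only three of the parameters from the question are used for brevity.

```python
def prefix_params(params, step_name):
    """Return a copy of params with every key rewritten as '<step_name>__<key>'."""
    return {f"{step_name}__{key}": value for key, value in params.items()}


# A shortened version of best_params from the question
best_params = {'boosting_type': 'dart', 'max_depth': 7, 'num_leaves': 61}

# 'lgb' is the step name used when the pipeline was created
pipe_params = prefix_params(best_params, 'lgb')
print(pipe_params)

# With the pipeline from the question, the call would then be:
# pipe.set_params(**pipe_params)
```

The step name passed to the helper has to match the name given in the steps list exactly, since that is what the pipeline uses to route each parameter.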
You would have discovered this yourself if you had followed the advice clearly offered in your error message:
Check the list of available parameters with `estimator.get_params().keys()`.
In your case,
pipe.get_params().keys()
gives
dict_keys(['memory',
'steps',
'verbose',
'lgb',
'lgb__boosting_type',
'lgb__class_weight',
'lgb__colsample_bytree',
'lgb__importance_type',
'lgb__learning_rate',
'lgb__max_depth',
'lgb__min_child_samples',
'lgb__min_child_weight',
'lgb__min_split_gain',
'lgb__n_estimators',
'lgb__n_jobs',
'lgb__num_leaves',
'lgb__objective',
'lgb__random_state',
'lgb__reg_alpha',
'lgb__reg_lambda',
'lgb__silent',
'lgb__subsample',
'lgb__subsample_for_bin',
'lgb__subsample_freq'])
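Note that some keys from the question (feature_fraction, metric, min_data_in_leaf) do not appear in this listing. A quick way to spot such keys before calling set_params is to compare the prefixed names against the listing. This is a rough sketch under stated assumptions: split_by_known_keys is a hypothetical helper, and valid_keys is a hand-copied subset of the output above; in practice you would use set(pipe.get_params().keys()). Whether the unlisted keys are still accepted by the LightGBM wrapper is not something this check can tell you.

```python
def split_by_known_keys(params, valid_keys, step_name):
    """Prefix each key with '<step_name>__' and split the result into
    (keys present in valid_keys, keys absent from valid_keys)."""
    known, unknown = {}, {}
    for key, value in params.items():
        full_key = f"{step_name}__{key}"
        if full_key in valid_keys:
            known[full_key] = value
        else:
            unknown[full_key] = value
    return known, unknown


# Hand-copied subset of the listing above (assumption for this sketch);
# with a real pipeline use: valid_keys = set(pipe.get_params().keys())
valid_keys = {'lgb__boosting_type', 'lgb__max_depth', 'lgb__num_leaves'}

best_params = {'boosting_type': 'dart', 'max_depth': 7, 'min_data_in_leaf': 210}
known, unknown = split_by_known_keys(best_params, valid_keys, 'lgb')
print("listed:", sorted(known))
print("not listed:", sorted(unknown))
```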
Upvotes: 3