Reputation: 5240
suppose if i am doing hyper parameter tuning to one of my model, lets say, i am using AdaBoostClassifier() and want to pass different base_estimator, so i pass SVC & DecisionTreeClassifier as estimator
_parameters=[
{
'mdl':[AdaBoostClassifier(random_state=23)],
'mdl__learning_rate':np.linspace(0,1,20),
'mdl__base_estimator':[SVC(),DecisionTreeClassifier()]
}
]
now, i want to pass different values to ccp_alpha of DecisionTreeClassifier, something like this
'mdl__base_estimator':[LinearRegression(),DecisionTreeClassifier(ccp_alpha=[0.1,0.2,0.3,0.4])]
how can i do that, i tried passing it like this, but it is not working, here is my entire code
pipeline=Pipeline(
[
('scal',StandardScaler()),
('mdl','passthrough')
]
)
_parameters=[
{
'mdl':[DecisionTreeClassifier(random_state=42)] ,
'mdl__max_depth':np.linspace(2,30,2),
'mdl__min_samples_split':np.linspace(1,10,1),
'mdl__max_features':np.linspace(1,100,1),
'mdl__ccp_alpha':np.linspace(0,1,10)
}
,{
'mdl':[AdaBoostClassifier(random_state=23)],
'mdl__learning_rate':np.linspace(0,1,20),
'mdl__base_estimator':[SVC(),DecisionTreeClassifier(ccp_alpha=[0.3,0.4,0.5,0.7])]
}
]
grid_search=GridSearchCV(_pipeline,_parameters,cv=3,n_jobs=-1,scoring='f1')
grid_search.fit(x,y
)
Upvotes: 0
Views: 187
Reputation: 12698
This kind of splitting is why param_grid
can be a list of dicts, as in your outer split; but it cannot easily handle the nested disjunction you have. Two approaches come to mind.
More disjoint grids:
_parameters=[
{
'mdl': [DecisionTreeClassifier(random_state=42)],
'mdl__max_depth': np.linspace(2,30,2),
'mdl__min_samples_split': np.linspace(1,10,1),
'mdl__max_features': np.linspace(1,100,1),
'mdl__ccp_alpha': np.linspace(0,1,10),
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [SVC()],
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [DecisionTreeClassifier()],
'mdl__base_estimator__ccp_alpha': [0.3,0.4,0.5,0.7],
},
]
Or list comprehension:
_parameters=[
{
'mdl': [DecisionTreeClassifier(random_state=42)],
'mdl__max_depth': np.linspace(2,30,2),
'mdl__min_samples_split': np.linspace(1,10,1),
'mdl__max_features': np.linspace(1,100,1),
'mdl__ccp_alpha': np.linspace(0,1,10),
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [SVC()] + [DecisionTreeClassifier(ccp_alpha=a) for a in [0.3,0.4,0.5,0.7]],
},
]
Upvotes: 1