Pointing RandomizedSearchCV to a Classifier

Question

I am using the workflow below to train a random forest classifier for production use. I tune the classifier's parameters with RandomizedSearchCV, print the results, and then build a new pipeline by hand from the best parameters. I assume there is a way to point the best result of the RandomizedSearchCV directly at a classifier so that I don't have to do this manually, but I can't figure out how.

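import sklearn.ensemble
import sklearn.feature_selection
import sklearn.pipeline
from scipy.stats import randint as sp_randint
from sklearn.model_selection import RandomizedSearchCV

# Pipeline: keep the 40 best features, then fit a random forest.
select = sklearn.feature_selection.SelectKBest(k=40)
clf = sklearn.ensemble.RandomForestClassifier()
steps = [('feature_selection', select),
         ('random_forest', clf)]
# Keys use the '<step name>__<parameter>' convention to reach into the pipeline.
# min_samples_split must be at least 2, so its range starts at 2 rather than 1.
parameters = {"random_forest__max_depth": [3, None],
              "random_forest__max_features": sp_randint(1, 21),
              "random_forest__min_samples_split": sp_randint(2, 21),
              "random_forest__min_samples_leaf": sp_randint(1, 21),
              "random_forest__bootstrap": [True, False],
              "random_forest__criterion": ["gini", "entropy"]}
pipeline = sklearn.pipeline.Pipeline(steps)
n_iter_search = 20
cv = RandomizedSearchCV(pipeline, param_distributions=parameters, n_iter=n_iter_search)
cv.fit(X, y)  # X, y: the training data, defined elsewhere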

dooms · Accepted Answer

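I'm not sure whether the estimator left inside the RandomizedSearchCV object is the best one or the last one fitted. To be sure you are getting the best model, use the best_estimator_ attribute. It is available after fit when refit=True (the default) and holds the estimator refit on the whole dataset with the best parameters found.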
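
As a rough sketch of what that could look like: the snippet below uses synthetic data from make_classification as a stand-in for the question's X and y, a shortened parameter grid, and small n_estimators, n_iter and cv values so it runs quickly. Those choices, and the names search, best_pipeline and predictions, are assumptions for illustration, not part of the original workflow.

```python
from scipy.stats import randint as sp_randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the question's X and y (assumption for illustration).
X, y = make_classification(n_samples=200, n_features=50,
                           n_informative=10, random_state=0)

pipeline = Pipeline([
    ("feature_selection", SelectKBest(k=40)),
    ("random_forest", RandomForestClassifier(n_estimators=10, random_state=0)),
])

# Shortened version of the question's search space.
parameters = {
    "random_forest__max_depth": [3, None],
    "random_forest__max_features": sp_randint(1, 21),
    "random_forest__min_samples_split": sp_randint(2, 21),
}

search = RandomizedSearchCV(pipeline, param_distributions=parameters,
                            n_iter=5, cv=3, random_state=0)
search.fit(X, y)

# With refit=True (the default), the whole pipeline is refit on all of X, y
# using the best parameters found, and stored as best_estimator_.
best_pipeline = search.best_estimator_
print(search.best_params_)

# The refit pipeline can be used directly; no need to rebuild it by hand.
predictions = best_pipeline.predict(X)
```

Calling search.predict(X) gives the same predictions, since the fitted search object delegates predict to best_estimator_ when refit is enabled.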
