Reputation: 705
In my project, I am using GridSearchCV in sklearn to exhaustively search over specified parameter values for a model to find the best possible parameter values. I have just tested it with RandomForestClassifier, and it helped me find the best max_depth and n_estimators. Based on that, I have two questions:
1. Does GridSearchCV use the concept of Maximum Likelihood Estimation (MLE) under the hood?
2. Instead of using GridSearchCV for every model, is there a technique that I can use to choose the best model for my dataset? I think this falls under the concept of model selection, but I don't know how to use it via sklearn.
Thank you
Upvotes: 1
Views: 10805
Reputation: 66805
Does GridSearchCV use the concept of Maximum Likelihood Estimation (MLE) under the hood?
MLE is a form of probabilistic reasoning, so it can only be applied to probabilistic models. GridSearchCV is not MLE-based; it is a simple trick for doing model selection based on a direct estimate of the test error. Given a particular model, it assigns a number that represents how good the model is; given many models, you simply select the one with the biggest number (the highest estimated generalization strength).
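To make the "assign a number, pick the biggest" idea concrete, here is a minimal sketch. The synthetic dataset, the grid values, and the choice of accuracy as the score are assumptions for illustration only; they are not taken from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic data stands in for the real dataset (an assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical grid: every combination is one candidate "model".
param_grid = {"max_depth": [2, 4, None], "n_estimators": [10, 30]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,                # 3-fold cross-validation estimates the test error
    scoring="accuracy",  # the "number" assigned to each candidate
)
search.fit(X, y)

# One mean cross-validated score per candidate; no likelihood is involved.
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    print(params, round(score, 3))

# The winner is simply the candidate with the highest mean score.
print("best:", search.best_params_, search.best_score_)
```

Nothing in this loop maximizes a likelihood: each candidate is fitted and scored on held-out folds, and the scores are compared directly.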
Instead of using GridSearchCV for every model, is there a technique that I can use to choose the best model for my dataset? I think this is under the concept of model selection but I don't know how to use it via sklearn.
There are plenty; however, sklearn pretty much only implements various train-test splitters (CV, random, etc.). Instead, you might want to consider other libraries that support:
These are more advanced methods of searching for good hyperparameters (rather than just checking already existing ones).
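Staying within sklearn, the same "highest estimated score wins" idea from the first part of this answer can also be applied across different model families. The following is only a sketch under assumptions: the three candidate estimators, their settings, and the synthetic data are illustrative choices, not a recommendation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data stands in for the real dataset (an assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical candidate models; any sklearn estimators could be listed here.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=30, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

# Estimate the generalization accuracy of each candidate with 3-fold CV.
mean_scores = {
    name: cross_val_score(model, X, y, cv=3, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Select the candidate with the highest mean cross-validated score.
best_name = max(mean_scores, key=mean_scores.get)
print(mean_scores)
print("selected:", best_name)
```

Note that the winning score is a somewhat optimistic estimate, because it was used for the selection itself; a separate held-out test set gives a fairer final number.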
Upvotes: 3