Reputation: 241
While doing GridSearchCV, what is the difference between the scores obtained through grid.score(...) and grid.best_score_?
Kindly assume that a model, features, target, and param_grid are in place. Here is the part of the code I am curious about.
grid = GridSearchCV(estimator=my_model, param_grid=params, cv=3,
                    return_train_score=True, scoring='neg_mean_squared_error')
grid.fit(X_train, y_train)
best_score_1 = grid.score(X_valid, y_valid)
best_score_2 = grid.best_score_
best_score_1 and best_score_2 give two different outputs. I am trying to understand the difference between the two, as well as which one should be considered the best score that came out of the given param_grid.
Following is the full function.
def apply_grid(df, model, features, target, params, test=False):
    '''
    Performs GridSearchCV after re-splitting the dataset, provides
    a comparison between the train MSE and test MSE to check for
    generalization, and optionally deploys the best-found parameters
    on the test set as well.
    Args:
        df: DataFrame
        model: a model class to use
        features: features to consider
        target: labels
        params: param_grid for optimization
        test: False by default; if True, predicts on the test set
    Returns:
        MSE scores on the models and a slice of cv_results_
        to compare the models' generalization performance
    '''
    my_model = model()
    # Split the dataset into train and test
    X_train, X_test, y_train, y_test = train_test_split(df[features],
                                                        df[target], random_state=0)
    # Re-split the train set into train2 and valid to keep the test set separate
    X_train2, X_valid, y_train2, y_valid = train_test_split(X_train,
                                                            y_train, random_state=0)
    # Use grid search to find the best parameters from the param_grid
    grid = GridSearchCV(estimator=my_model, param_grid=params, cv=3,
                        return_train_score=True, scoring='neg_mean_squared_error')
    grid.fit(X_train2, y_train2)
    # Evaluate on the valid set
    scores = grid.score(X_valid, y_valid)
    print('Best MSE through GridSearchCV: ', grid.best_score_)  # CONFUSION
    print('Best MSE through GridSearchCV: ', scores)            # CONFUSION
    print('I AM CONFUSED ABOUT THESE TWO OUTPUTS ABOVE. WHY ARE THEY DIFFERENT?')
    print('Best Parameters: ', grid.best_params_)
    print('-' * 120)
    print('mean_test_score is rather mean_valid_score')
    report = pd.DataFrame(grid.cv_results_)
    # If test is True, deploy the best_params_ on the test set
    if test:
        my_model = model(**grid.best_params_)
        my_model.fit(X_train, y_train)
        predictions = my_model.predict(X_test)
        mse = mean_squared_error(y_test, predictions)
        print('TEST MSE with the best params: ', mse)
        print('-' * 120)
    return report[['mean_train_score', 'mean_test_score']]
Upvotes: 0
Views: 818
Reputation: 56
UPDATED
As explained in the sklearn documentation, GridSearchCV takes the lists of parameter values you pass and tries all possible combinations to find the best parameters.
To evaluate which parameters are best, it runs a k-fold cross-validation for each parameter combination. In k-fold cross-validation, the training set is divided into a training part and a validation part. If you choose, for example, cv=5,
the dataset is divided into 5 non-overlapping folds, and each fold is used once as the validation set while all the others are used as the training set. So, in this example, GridSearchCV computes the average validation score (which can be accuracy, negative MSE, or whatever scoring you chose) over the 5 folds for each parameter combination. At the end of the search there is one average validation score per parameter combination, and the combination with the highest average validation score is selected. That average validation score, associated with the best parameters, is what is stored in the grid.best_score_
variable.
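A minimal sketch of this, on hypothetical synthetic data (the Ridge model and alpha grid are just placeholders): grid.best_score_ is the best mean cross-validated score found in cv_results_, not a score computed on any held-out data.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Hypothetical regression data for illustration only
X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

grid = GridSearchCV(Ridge(), param_grid={'alpha': [0.1, 1.0, 10.0]},
                    cv=5, scoring='neg_mean_squared_error')
grid.fit(X, y)

# best_score_ is exactly the highest mean validation score in cv_results_
assert np.isclose(grid.best_score_,
                  grid.cv_results_['mean_test_score'].max())
print(grid.best_params_, grid.best_score_)
```

Note that with scoring='neg_mean_squared_error' the score is negated MSE, so "highest" means closest to zero.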
On the other hand, the grid.score(X_valid, y_valid)
method gives the score on the given data, provided the estimator has been refitted (refit=True)
. This means it is not the average score over the 5 folds: the model with the best parameters is refit on the whole training set, predictions are then computed on X_valid
and compared with y_valid
in order to get the score.
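To make the difference concrete, a sketch on hypothetical synthetic data (model and grid are again placeholders): grid.score(X_valid, y_valid) is the scorer applied to the refitted best_estimator_ on the held-out data, which generally differs from grid.best_score_.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical regression data for illustration only
X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

grid = GridSearchCV(Ridge(), param_grid={'alpha': [0.1, 1.0, 10.0]},
                    cv=5, scoring='neg_mean_squared_error')
grid.fit(X_train, y_train)

# Score of the refitted best model on held-out data
valid_score = grid.score(X_valid, y_valid)

# Equivalent to scoring best_estimator_ by hand with the same metric
manual = -mean_squared_error(y_valid, grid.best_estimator_.predict(X_valid))
assert np.isclose(valid_score, manual)
```

So best_score_ answers "how did the best parameter combination do on average during cross-validation", while score(X_valid, y_valid) answers "how does the final refitted model do on this particular held-out set" - two different questions, hence two different numbers.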
Upvotes: 1