oriKAN

Reputation: 9

Understanding K-Fold Cross-Validation, Model Training, and R² Scores

I'm working with K-Fold Cross-Validation in a Grid Search setup for hyperparameter tuning. I have a few questions about how the model is trained and evaluated:

  1. When I use GridSearchCV, the model is evaluated across multiple folds (let's say 10). For each hyperparameter combination, the model is trained on K-1 folds and validated on the remaining fold. When I obtain the best_grid model after the grid search, which specific training data (i.e., which folds) was this model trained on?

  2. When I call best_grid.predict(X_test), on which dataset is this model making predictions? Has it been trained on the entire dataset after the grid search, or is it still based on the folds used during cross-validation?

  3. If the best_grid model has not been trained on the entire dataset yet, do I need to explicitly fit it to the full dataset again before making predictions?

  4. I want to get the R² train score, but I'm confused about the score I receive when using the following code:

    param_grid = {f'regressor__regressor__{param}': values
                  for param, values in model_info['params'].items()}
    grid_search = GridSearchCV(full_pipeline, param_grid,
                               cv=stratified_kf.split(X, y_binned),
                               scoring="r2", n_jobs=4,
                               return_train_score=True)

    grid_search.fit(X, y)

    if grid_search.best_score_ > best_score:
        best_score = grid_search.best_score_
        best_model = model_name
        best_grid = grid_search

    mean_train_score = best_grid.cv_results_['mean_train_score'][best_grid.best_index_]
    print(mean_train_score)  # THIS THING HERE


Upvotes: -1

Views: 56

Answers (1)

Matt Hall

Reputation: 8152

The code you have returns the R² score on the training folds, averaged across the folds. It's there in cv_results_ as mean_train_score, populated because you passed return_train_score=True.
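If you want to see all of the scores side by side, a quick sketch (assuming pandas is available):

    import pandas as pd

    # cv_results_ is a dict of parallel arrays; a DataFrame makes it easy to scan.
    results = pd.DataFrame(grid_search.cv_results_)
    print(results[['params', 'mean_train_score', 'mean_test_score']])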

Realize that the grid_search object keeps track of the best score and best parameters for you; you don't need your if block.
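A minimal sketch of the attributes that already hold this information:

    print(grid_search.best_score_)   # best mean cross-validated R² over the grid
    print(grid_search.best_params_)  # the hyperparameter combination that won
    print(grid_search.best_index_)   # its row in cv_results_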

The grid_search object is itself an estimator, and it ends up being fit with the best parameters (as determined by the cross-validated grid search) on all the data you give it (X and y in your case). This is because, by default, refit=True. So when you call predict, you are using the best model the grid search found. (This refitted model is also available as grid_search.best_estimator_.)
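Here's a sketch of what that means in practice, reusing the X, y, and X_test from your question:

    # With refit=True (the default), fit() refits the winning parameter
    # combination on all of X and y after cross-validation finishes.
    grid_search.fit(X, y)

    # predict() delegates to that refitted estimator...
    y_pred = grid_search.predict(X_test)

    # ...which you can also grab directly:
    best = grid_search.best_estimator_
    assert (best.predict(X_test) == y_pred).all()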

If you do grid_search.predict(X_test) then you are predicting on X_test. So hopefully this data was not part of X.
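In other words, the held-out set should be split off before the grid search ever sees the data. A sketch, where X_full and y_full are hypothetical names for your complete dataset:

    from sklearn.model_selection import train_test_split

    # Keep X_test completely outside the cross-validated search.
    X, X_test, y, y_test = train_test_split(X_full, y_full,
                                            test_size=0.2, random_state=0)
    grid_search.fit(X, y)                        # CV folds come only from X
    r2_test = grid_search.score(X_test, y_test)  # honest held-out R²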

Remember to refit your best model on absolutely all your data before using it in production! (That is, combine X and X_test to fit the best model.)
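A sketch of that final step, assuming X, y, X_test, and y_test are NumPy arrays:

    import numpy as np
    from sklearn.base import clone

    # Same winning hyperparameters, fresh unfitted copy.
    final_model = clone(grid_search.best_estimator_)

    # Refit on every row you have before deploying.
    X_all = np.concatenate([X, X_test])
    y_all = np.concatenate([y, y_test])
    final_model.fit(X_all, y_all)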

Upvotes: 0
