Kirill

Reputation: 1

Prediction biases in RandomForestRegressor when using GridSearchCV

[Plot: model predictions against true values (true values on the y-axis)]

Which parameter of GridSearchCV or RandomForestRegressor can be used to rotate the predictions, that is, to remove the bias visible in the plot so that the model gives much better results?

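import time

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

start = time.time()

param_dists = {'max_features': ["sqrt", "log2", None],  # None (all features) matches the default behaviour
               'max_depth': [None, *range(1, 50)],  # default: None; the range is unpacked so each depth is its own candidate
               'criterion': ['mae', 'squared_error']  # default: squared_error; 'mae' is named 'absolute_error' from scikit-learn 1.0 on
              }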



model_RandomForestRegressor_TuneSearchCV = RandomForestRegressor(random_state=270223)  

grid_RandomForestRegressor = GridSearchCV(model_RandomForestRegressor_TuneSearchCV,
                                          param_dists,
                                          cv=3,
                                          n_jobs=-1,
                                          scoring='neg_mean_absolute_error')

grid_RandomForestRegressor.fit(X_train, y_train)

best_score = -1 * grid_RandomForestRegressor.best_score_
predict_dt = grid_RandomForestRegressor.predict(X_test)

time_hyperopt_RandomForestRegressor = round((time.time() - start), 2)

print("MAE of the random forest model with tuned hyperparameters:", best_score)
print('Best parameters', grid_RandomForestRegressor.best_params_)


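results_model = {'Model': 'RandomForestRegressor with tuned hyperparameters',
                 'MAE': best_score,
                 'Total time': time_hyperopt_RandomForestRegressor
              }

# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
results = pd.concat([results, pd.DataFrame([results_model])], ignore_index=True)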

The metric is MAE.

Current output: MAE of the random forest model with tuned hyperparameters: 5.564043154396905. Best parameters: {'criterion': 'mae', 'max_depth': None, 'max_features': None}

Upvotes: 0

Views: 43

Answers (1)

Téo

Reputation: 309

I think you're misunderstanding the implications of the results in your plot.

The points grouped in the middle all lie in a narrow band of true values (y-axis), so your model should output similar predictions for them. Instead, it predicts a wide range of different values. This means that your model is NOT learning the underlying mapping.
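To put a number on what the plot shows, here is a minimal sketch. It uses synthetic data, since your X_train and y_train are not shown, and the helper name band_spread is my own invention. It compares how widely the predictions scatter inside a middle band of true values when the features carry signal and when they do not.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def band_spread(y_true, y_pred, low_q=0.4, high_q=0.6):
    """Return (std of predictions, std of true values) for the samples
    whose true value lies between the two given quantiles."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    low, high = np.quantile(y_true, [low_q, high_q])
    mask = (y_true >= low) & (y_true <= high)
    return float(np.std(y_pred[mask])), float(np.std(y_true[mask]))


# Synthetic stand-in data (an assumption, not the data from the question).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
targets = {
    "informative features": 3.0 * X[:, 0] + rng.normal(scale=0.5, size=600),
    "uninformative features": rng.normal(size=600),  # unrelated to X
}

scores = {}
for name, y in targets.items():
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    pred_std, true_std = band_spread(y_te, pred)
    scores[name] = r2_score(y_te, pred)
    print(f"{name}: R^2={scores[name]:.3f}, "
          f"prediction std in band={pred_std:.3f}, true std in band={true_std:.3f}")
```

If the prediction spread inside the band is much larger than the spread of the true values there, and R^2 on held-out data is near zero, the model is in the second situation.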

No parameter of GridSearchCV or RandomForestRegressor will rotate the predictions into place. You need to train a better model.
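One way to judge whether a model is "better" is to compare its cross-validated MAE with a trivial baseline that always predicts the median. The sketch below assumes synthetic data in place of yours, and the helper cv_mae and the names X_demo and y_demo are hypothetical. An MAE such as 5.56 only means something relative to a baseline like this.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score


def cv_mae(estimator, X, y, cv=3):
    """Mean absolute error averaged over cross-validation folds."""
    scores = cross_val_score(estimator, X, y, cv=cv,
                             scoring="neg_mean_absolute_error")
    return float(-scores.mean())  # scikit-learn reports the negated MAE


# Synthetic stand-in data (an assumption, not the data from the question).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 4))
y_demo = 2.0 * X_demo[:, 0] + rng.normal(scale=0.3, size=300)

# Baseline: ignores the features and always predicts the training median.
baseline_mae = cv_mae(DummyRegressor(strategy="median"), X_demo, y_demo)
forest_mae = cv_mae(RandomForestRegressor(n_estimators=100, random_state=0),
                    X_demo, y_demo)

print(f"baseline MAE: {baseline_mae:.3f}")
print(f"forest MAE:   {forest_mae:.3f}")
```

If the forest's MAE on your data is close to the baseline's, tuning max_depth or criterion is unlikely to help, and the features or the target definition probably need work instead.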

Upvotes: 0
