devautor

Reputation: 2586

XGBoost early stopping cv versus GridSearchCV

I am trying XGBoost to solve a regression problem. In the process of hyperparameter tuning, XGBoost's early-stopping cv never stops for my code/data, regardless of what the parameter num_boost_round is set to. It also produces poorer RMSE scores than GridSearchCV. What am I doing wrong here? And if I am not doing anything wrong, what advantages does early-stopping cv offer over GridSearchCV?

GridSearchCV:

import math
import xgboost as xgb
from sklearn.metrics import mean_squared_error, make_scorer
from sklearn.grid_search import GridSearchCV  # sklearn.model_selection in newer versions

def RMSE(y_true, y_pred):
    # Root mean squared error; printed on every call so the search is observable
    rmse = math.sqrt(mean_squared_error(y_true, y_pred))
    print('RMSE: %2.3f' % rmse)
    return rmse

scorer = make_scorer(RMSE, greater_is_better=False)

cv_params = {'max_depth': [2, 8], 'min_child_weight': [1, 5]}
ind_params = {'learning_rate': 0.01, 'n_estimators': 1000,
              'seed': 0, 'subsample': 0.8, 'colsample_bytree': 0.8,
              'reg_alpha': 0, 'reg_lambda': 1}  # regularization => L1: alpha, L2: lambda
optimized_GBM = GridSearchCV(xgb.XGBRegressor(**ind_params),
                             cv_params,
                             scoring=scorer,
                             cv=5, verbose=1,
                             n_jobs=1)
optimized_GBM.fit(train_X, train_Y)
optimized_GBM.grid_scores_  # replaced by cv_results_ in newer scikit-learn versions

Output:

[mean: -62.42736, std: 5.18004, params: {'max_depth': 2, 'min_child_weight': 1},
 mean: -62.42736, std: 5.18004, params: {'max_depth': 2, 'min_child_weight': 5},
 mean: -57.11358, std: 3.62918, params: {'max_depth': 8, 'min_child_weight': 1},
 mean: -57.12148, std: 3.64145, params: {'max_depth': 8, 'min_child_weight': 5}]
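
Note that because the scorer is built with greater_is_better=False, GridSearchCV reports the negated RMSE, so the best candidate is the one with the highest (least negative) mean. A quick way to read off the winner, using the standard GridSearchCV attributes:

print(optimized_GBM.best_params_)   # {'max_depth': 8, 'min_child_weight': 1} for the run above
print(-optimized_GBM.best_score_)   # undo the sign flip to get RMSE back, ~57.1 here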

XGBoost CV:

# train_mat is assumed to be an xgb.DMatrix built from the same data,
# e.g. train_mat = xgb.DMatrix(train_X, label=train_Y)
our_params = {'eta': 0.01, 'max_depth': 8, 'min_child_weight': 1,
              'seed': 0, 'subsample': 0.8, 'colsample_bytree': 0.8,
              'objective': 'reg:linear', 'booster': 'gblinear',
              'eval_metric': 'rmse',
              'silent': False}
num_rounds = 1000

cv_xgb = xgb.cv(params=our_params,
                dtrain=train_mat,
                num_boost_round=num_rounds,
                nfold=5,
                metrics=['rmse'],  # make sure you pass metrics inside a list or you may encounter issues!
                early_stopping_rounds=100,  # look for early stopping that minimizes error
                verbose_eval=True)

print(cv_xgb.shape)
print(cv_xgb.tail(5))

Output:

(1000, 4)
     test-rmse-mean  test-rmse-std  train-rmse-mean  train-rmse-std
995       89.937926       0.263546        89.932823        0.062540
996       89.937773       0.263537        89.932671        0.062537
997       89.937622       0.263526        89.932517        0.062535
998       89.937470       0.263516        89.932364        0.062532
999       89.937317       0.263510        89.932210        0.062525
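
For what it's worth, the tail above already hints at why early stopping never fires here: early_stopping_rounds=100 only halts training if the test metric fails to improve for 100 consecutive rounds, and test-rmse-mean is still (very slowly) decreasing at round 999. A quick sanity check on the returned DataFrame:

# True if the test RMSE improved on every round, in which case the
# 100-round no-improvement window can never be triggered.
print((cv_xgb['test-rmse-mean'].diff().dropna() < 0).all())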

Upvotes: 3

Views: 2544

Answers (1)

ftiaronsem

Reputation: 1584

I have the same issue with XGBoost ignoring num_boost_round (when early stopping is specified) and continuing to fit. I would wager that this is a bug.

As for the advantages of early stopping over GridSearchCV:

The advantage is that you don't have to try a whole series of values for num_boost_round yourself; training stops automatically at the best one.

Early stopping is designed to find the optimum number of boosting iterations. If you specify a very large number for num_boost_round (e.g. 10000) and the best number of trees turns out to be 5261, training will stop at round 5261 + early_stopping_rounds, giving you a model that is pretty close to the optimum.
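
A minimal sketch of that workflow with the native API, assuming hypothetical valid_X/valid_Y held-out arrays (when early stopping is active, xgb.train records the winning round in best_iteration):

import xgboost as xgb

dtrain = xgb.DMatrix(train_X, label=train_Y)
dvalid = xgb.DMatrix(valid_X, label=valid_Y)  # hypothetical held-out split

bst = xgb.train(our_params, dtrain,
                num_boost_round=10000,      # deliberately generous upper bound
                evals=[(dvalid, 'valid')],
                early_stopping_rounds=100,  # stop 100 rounds after the best score
                verbose_eval=False)

print(bst.best_iteration)  # round with the lowest validation RMSE
print(bst.best_score)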

If you wanted to find the same optimum using GridSearchCV without early stopping, you would have to try many different values of n_estimators (the wrapper's name for num_boost_round), e.g. 100, 200, 300, ..., 5000, 5100, 5200, 5300, and so on, as sketched below. This would take much longer.
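
For contrast, the brute-force version might look like this (a sketch reusing ind_params and scorer from the question; every candidate retrains the whole ensemble from scratch):

# 54 candidate values x 5 folds = 270 complete training runs,
# versus a single early-stopped run per fold with xgb.cv.
boost_grid = {'n_estimators': list(range(100, 5401, 100))}
brute_force = GridSearchCV(xgb.XGBRegressor(**ind_params),
                           boost_grid, scoring=scorer, cv=5)
brute_force.fit(train_X, train_Y)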

The property that early stopping exploits is that there is an optimal number of boosting steps, after which the validation error will start to increase. Which raises the question:

why doesn't it work for your case?

It is impossible to say precisely without the data, but it is probably due to a combination of the following:

  • num_boost_round is too small (and you run into the bug where XGBoost resets and starts over, creating a never-ending loop)
  • early_stopping_rounds is too large (maybe your data has a strongly oscillating convergence behavior; try a smaller value and see whether the CV error is good enough)
  • something might be strange about your validation data

Why are you seeing different results between GridSearchCV and xgboost.cv?

It is difficult to tell without a fully working example, but have you checked all the default values for the parameters that you only specify in one of the two interfaces (such as 'reg_alpha': 0, 'reg_lambda': 1, 'objective': 'reg:linear', 'booster': 'gblinear'), and whether your definition of RMSE exactly matches XGBoost's definition?
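
One quick way to surface those mismatches (a sketch; get_params() is the standard scikit-learn accessor on the wrapper):

# Dump the wrapper's effective parameters and compare them against our_params;
# anything set in only one of the two interfaces is a candidate for the gap.
print(xgb.XGBRegressor(**ind_params).get_params())
print(our_params)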

Upvotes: 2
