Reputation: 11
I am using xgboost.train to train my model; however, I am not sure how to obtain the booster from the best iteration rather than the booster from the last iteration.
xgb1 = xgb.train(self.params, xgtrain,
                 num_boost_round=self.num_boost_round,
                 early_stopping_rounds=self.early_stopping_rounds,
                 evals=watchlist)
print(xgb1.best_score)
print(xgb1.best_iteration)
print(xgb1.best_ntree_limit)
Upvotes: 1
Views: 5088
Reputation: 591
ntree_limit is deprecated; use iteration_range instead. By default XGBoost uses all the trees for prediction, but if iteration_range is specified, only the trees in that range are used for prediction. You can combine this with a learning curve to get the same result as early_stopping_rounds: if you see your validation-set performance stagnate after n rounds, pass that n in iteration_range to get predictions from the booster up to round n:

preds = xgb1.predict(testset, iteration_range=(0, n))
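For example, here is a minimal end-to-end sketch (the data is synthetic and purely illustrative, and it assumes xgboost >= 1.4, where iteration_range is available):

import numpy as np
import xgboost as xgb

# Synthetic data purely for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = rng.integers(0, 2, size=500)
dtrain = xgb.DMatrix(X[:400], label=y[:400])
dvalid = xgb.DMatrix(X[400:], label=y[400:])

params = {"objective": "binary:logistic", "eval_metric": "logloss"}
bst = xgb.train(params, dtrain, num_boost_round=500,
                early_stopping_rounds=20,
                evals=[(dtrain, "train"), (dvalid, "valid")])

# best_iteration is 0-based and iteration_range is half-open,
# so add 1 to include the best round itself
preds = bst.predict(dvalid, iteration_range=(0, bst.best_iteration + 1))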
Upvotes: 5
Reputation: 1054
You can set this when making predictions with the model. Note that ntree_limit expects a tree count, so pass best_ntree_limit rather than best_iteration (a 0-based index, which would drop the best tree):

preds = xgb1.predict(testset, ntree_limit=xgb1.best_ntree_limit)
Or use the param early_stopping_rounds, which guarantees that you'll get a tree near the best one. But be careful with this param, because the evaluation value can sit at a local minimum or maximum (depending on the evaluation function).
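If you need an actual booster object truncated at the best round, rather than limiting trees at predict time, newer XGBoost versions also support slicing a trained booster. This is a sketch assuming xgboost >= 1.3 (where model slicing was introduced), reusing the question's xgb1 and testset:

# Slicing keeps only the trees from rounds [0, best_iteration]
best_booster = xgb1[0 : xgb1.best_iteration + 1]
preds = best_booster.predict(testset)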
See also: https://github.com/dmlc/xgboost/issues/264
Upvotes: 3