Reputation: 1554
When I train an XGBoost model and use AUC as the metric to evaluate performance, I notice that the AUC score for the first several rounds is always 0.5. Basically, it looks like the first several trees did not learn anything:
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.
Will train until eval-auc hasn't improved in 20 rounds.
[0] train-auc:0.5 eval-auc:0.5
[1] train-auc:0.5 eval-auc:0.5
[2] train-auc:0.5 eval-auc:0.5
[3] train-auc:0.5 eval-auc:0.5
[4] train-auc:0.5 eval-auc:0.5
[5] train-auc:0.5 eval-auc:0.5
[6] train-auc:0.5 eval-auc:0.5
[7] train-auc:0.5 eval-auc:0.5
[8] train-auc:0.5 eval-auc:0.5
[9] train-auc:0.5 eval-auc:0.5
[10] train-auc:0.5 eval-auc:0.5
[11] train-auc:0.5 eval-auc:0.5
[12] train-auc:0.5 eval-auc:0.5
[13] train-auc:0.5 eval-auc:0.5
[14] train-auc:0.537714 eval-auc:0.51776
[15] train-auc:0.541722 eval-auc:0.521087
[16] train-auc:0.555587 eval-auc:0.527019
[17] train-auc:0.669665 eval-auc:0.632106
[18] train-auc:0.6996 eval-auc:0.651677
[19] train-auc:0.721472 eval-auc:0.680481
[20] train-auc:0.722052 eval-auc:0.684549
[21] train-auc:0.736386 eval-auc:0.690942
As you can see, the first 14 rounds ([0] through [13]) did not learn anything.
The parameters I used: param = {'max_depth':6, 'eta':0.3, 'silent':1, 'objective':'binary:logistic'}
I'm using xgboost 0.8.
Is there any way to prevent this?
Thanks
Upvotes: 2
Views: 737
Reputation: 529
An AUC of 0.5 during the first several rounds does not mean that XGBoost is not learning. Check whether your dataset is imbalanced. If it is, the predictions for all instances (both target=1 and target=0) first move from the default 0.5 toward the target mean, e.g. 0.17. During this phase logloss improves, so learning is going on, but AUC only starts to improve once the predictions begin to separate the two classes. If you want to help the algorithm reach that region faster, change the parameter base_score from its default of 0.5 to the target mean.
https://xgboost.readthedocs.io/en/latest/parameter.html
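A minimal sketch of the idea, using toy labels (not data from the question): compute the target mean from the training labels and pass it as `base_score`, keeping the question's other parameters unchanged. The `xgboost.train` call is left as a comment, since `dtrain`/`deval` are hypothetical DMatrix names.

```python
# Toy labels with ~17% positives, standing in for an imbalanced target.
y_train = [1] * 17 + [0] * 83

# Target mean: the fraction of positive instances in the training set.
base_score = sum(y_train) / len(y_train)   # 0.17 here

param = {
    'max_depth': 6,
    'eta': 0.3,
    'silent': 1,
    'objective': 'binary:logistic',
    'base_score': base_score,   # default is 0.5; start from the class prior instead
}

# Then train as usual, e.g. (dtrain/deval are hypothetical DMatrix objects):
# bst = xgboost.train(param, dtrain, num_boost_round=200,
#                     evals=[(dtrain, 'train'), (deval, 'eval')],
#                     early_stopping_rounds=20)
print(param['base_score'])
```

Starting from the class prior means the booster does not spend its first rounds shifting every prediction from 0.5 toward the target mean, so AUC can begin moving sooner.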
Upvotes: 1