Reputation: 737
I can't understand the output of
kfold_results = cross_val_score(xg_cl, X_train, y_train, cv=kfold, scoring='roc_auc')
The output of xgb.cv is clear - there are the train and test scores:
[0] train-auc:0.927637+0.00405497 test-auc:0.788526+0.0152854
[1] train-auc:0.978419+0.0018253 test-auc:0.851634+0.0201297
[2] train-auc:0.985103+0.00191355 test-auc:0.86195+0.0164157
[3] train-auc:0.988391+0.000999448 test-auc:0.870363+0.0161025
[4] train-auc:0.991542+0.000756701 test-auc:0.881663+0.013579
But the result of cross_val_score with the scikit-learn wrapper is ambiguous: it is a list of scores, one per fold, but are they scores on the test data or on the train data?
Upvotes: 2
Views: 243
Reputation: 18367
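KFold splits the data into the number of folds being passed. If cv is None, the default is used; the scikit-learn docs note: "Changed in version 0.20: cv default value if None will change from 3-fold to 5-fold in v0.22."
So what it does is split the dataset into 5 subsets (the default as of version 0.22), use 4 for training and 1 for validation, and repeat until each subset has served as the validation set once. Therefore the output is an array of 5 items, 1 for each iteration. Each item is the score on the held-out (validation) fold, so these are test scores, not train scores.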
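To make the per-fold output concrete, here is a minimal sketch. The synthetic dataset and the LogisticRegression estimator are stand-ins I am assuming for illustration (they are not from the question); an XGBClassifier such as xg_cl should be passed the same way. It also uses cross_validate with return_train_score=True, which is one way to get train scores alongside the held-out ones, similar in spirit to what xgb.cv prints.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, cross_validate

# Synthetic data standing in for X_train / y_train (assumption, for illustration only).
X_train, y_train = make_classification(n_samples=500, n_features=20, random_state=0)

# Stand-in estimator (assumption); the question's xg_cl would be used in its place.
clf = LogisticRegression(max_iter=1000)
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

# One ROC AUC per fold, each computed on that fold's held-out data.
kfold_results = cross_val_score(clf, X_train, y_train, cv=kfold, scoring='roc_auc')

# cross_validate can additionally report the scores on the training folds.
cv_results = cross_validate(
    clf, X_train, y_train, cv=kfold, scoring='roc_auc', return_train_score=True
)

print("held-out scores:", kfold_results)
print("train scores:   ", cv_results['train_score'])
```

With the same splitter and estimator, cv_results['test_score'] should match what cross_val_score returns, which is a quick way to convince yourself that the cross_val_score numbers are the held-out ones.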
Upvotes: 1