XGBoost algorithm, question about the evaluate_model function

Question

This evaluate_model function is frequently used; I found it used here at IBM. I will show the function here:

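import pandas as pd
import xgboost as xgb
import matplotlib.pyplot as plt
from sklearn import metrics

def evaluate_model(alg, train, target, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):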

    if useTrainCV:
        xgb_param = alg.get_xgb_params()
        xgtrain = xgb.DMatrix(train[predictors].values, target['Default Flag'].values)
        cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'], nfold=cv_folds,
            metrics='auc', early_stopping_rounds=early_stopping_rounds, verbose_eval=True)
        alg.set_params(n_estimators=cvresult.shape[0])

    #Fit the algorithm on the data
    #(recent XGBoost releases expect eval_metric in the constructor rather than in fit())
    alg.fit(train[predictors], target['Default Flag'], eval_metric='auc')

    #Predict training set:
    dtrain_predictions = alg.predict(train[predictors])
    dtrain_predprob = alg.predict_proba(train[predictors])[:,1]

    #Print model report:
    print("\nModel Report")
    print("Accuracy : %.6g" % metrics.accuracy_score(target['Default Flag'].values, dtrain_predictions))
    print("AUC Score (Train): %f" % metrics.roc_auc_score(target['Default Flag'], dtrain_predprob))  
    plt.figure(figsize=(12,12))
    feat_imp = pd.Series(alg.get_booster().get_fscore()).sort_values(ascending=False)
    feat_imp.plot(kind='bar', title='Feature Importance', color='g')
    plt.ylabel('Feature Importance Score')
    plt.show()

After tuning the parameters for XGBoost, I have:

xgb4 = XGBClassifier(
    objective="binary:logistic", 
    learning_rate=0.10,  
    n_estimators=5000,
    max_depth=6,
    min_child_weight=1,
    gamma=0.1,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.1,
    nthread=4,
    scale_pos_weight=1.0,
    seed=27)
features = [x for x in X_train.columns if x not in ['Default Flag','ID']]
evaluate_model(xgb4, X_train, y_train, features)

and the results I get are:

Model Report
Accuracy : 0.803236
AUC Score (Train): 0.856995

My question, perhaps an ill-informed one, is that this evaluate_model() function is not tested on the test set of the data, which I found odd. When I do call it on the test set (evaluate_model(xgb4, X_test, y_test, features)), I get this:

Model Report
Accuracy : 0.873706
AUC Score (Train): 0.965286

I want to know whether these two Model Reports are concerning at all, given that the test set has a higher accuracy than the training set. My apologies if this question is poorly structured.

CoMartel · Accepted Answer

I will develop my answer a little bit more:

This function trains on the dataset you give it and returns the training accuracy and AUC computed on that same data. It is therefore not a reliable way to evaluate your models.

In the link you provided, it is said that this function is used to tune the number of estimators:

The function below performs the following actions to find the best number of boosting trees to use on your data:

Trains an XGBoost model using features of the data.

Performs k-fold cross validation on the model, using accuracy and AUC score as the evaluation metric.

Returns output for each boosting round so you can see how the model is learning. You will look at the detailed output in the next section.

It stops running after the cross-validation score does not improve significantly with additional boosting rounds, giving you an optimal number of estimators for the model.

You should not use it to evaluate your model's performance; perform a clean cross-validation instead.
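A clean cross-validation could look like the sketch below. It rests on a few assumptions: scikit-learn is available, synthetic data from make_classification stands in for the real dataset, and GradientBoostingClassifier stands in for XGBClassifier so the example runs without xgboost installed. Any scikit-learn-compatible classifier, including an XGBClassifier, should be usable in its place.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_validate

# Synthetic stand-in data (assumption: the real task is binary classification).
X, y = make_classification(n_samples=600, n_features=10, n_informative=5,
                           flip_y=0.1, random_state=27)

# Stand-in model; an XGBClassifier could be passed here instead.
model = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=27)

# Every fold is scored on rows the model did not see while fitting.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=27)
scores = cross_validate(model, X, y, cv=cv, scoring=["accuracy", "roc_auc"])

cv_accuracy = scores["test_accuracy"].mean()
cv_auc = scores["test_roc_auc"].mean()
print("CV accuracy: %.4f" % cv_accuracy)
print("CV AUC: %.4f" % cv_auc)
```

The averaged fold scores estimate performance on unseen data, which the training-set numbers printed by evaluate_model cannot do.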

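Your test scores are higher in this case because calling the function on the test set refits the model on the test set and then scores it on those same rows. The test set is smaller, so the model overfits it more easily.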
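This effect can be reproduced on synthetic data. The sketch below makes the same kind of assumptions (scikit-learn available, make_classification as stand-in data, GradientBoostingClassifier standing in for XGBClassifier). It compares an honest hold-out score with the score obtained by refitting on the test set and predicting those same rows, which is what evaluate_model(xgb4, X_test, y_test, features) does.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Noisy synthetic data (assumption: stand-in for the real dataset).
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           flip_y=0.2, random_state=27)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=27)

# Stand-in model; an XGBClassifier could be used instead.
base = GradientBoostingClassifier(n_estimators=100, max_depth=3, random_state=27)

# Honest evaluation: fit on the training set, score on unseen test rows.
holdout_model = clone(base).fit(X_train, y_train)
holdout_auc = roc_auc_score(y_test, holdout_model.predict_proba(X_test)[:, 1])

# Misleading evaluation: refit on the (smaller) test set, score on the same rows.
refit_model = clone(base).fit(X_test, y_test)
refit_auc = roc_auc_score(y_test, refit_model.predict_proba(X_test)[:, 1])

print("Hold-out AUC: %.4f" % holdout_auc)
print("Refit-on-test AUC: %.4f" % refit_auc)
```

Only the hold-out number says anything about generalization; the refit number is a training score on a small dataset.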
