Reputation: 458
This evaluate_model function is frequently used; I found it used here at IBM. I will reproduce the function below:
import matplotlib.pyplot as plt
import pandas as pd
import xgboost as xgb
from sklearn import metrics
from xgboost import XGBClassifier

def evaluate_model(alg, train, target, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):
    if useTrainCV:
        # Use xgb.cv with early stopping to choose the number of boosting rounds
        xgb_param = alg.get_xgb_params()
        xgtrain = xgb.DMatrix(train[predictors].values, target['Default Flag'].values)
        cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'], nfold=cv_folds,
                          metrics='auc', early_stopping_rounds=early_stopping_rounds, verbose_eval=True)
        alg.set_params(n_estimators=cvresult.shape[0])
    #Fit the algorithm on the data
    alg.fit(train[predictors], target['Default Flag'], eval_metric='auc')
    #Predict training set:
    dtrain_predictions = alg.predict(train[predictors])
    dtrain_predprob = alg.predict_proba(train[predictors])[:,1]
    #Print model report:
    print("\nModel Report")
    print("Accuracy : %.6g" % metrics.accuracy_score(target['Default Flag'].values, dtrain_predictions))
    print("AUC Score (Train): %f" % metrics.roc_auc_score(target['Default Flag'], dtrain_predprob))
    plt.figure(figsize=(12,12))
    feat_imp = pd.Series(alg.get_booster().get_fscore()).sort_values(ascending=False)
    feat_imp.plot(kind='bar', title='Feature Importance', color='g')
    plt.ylabel('Feature Importance Score')
    plt.show()
After tuning the parameters for XGBoost, I have
xgb4 = XGBClassifier(
    objective="binary:logistic",
    learning_rate=0.10,
    n_estimators=5000,
    max_depth=6,
    min_child_weight=1,
    gamma=0.1,
    subsample=0.8,
    colsample_bytree=0.8,
    reg_alpha=0.1,
    nthread=4,
    scale_pos_weight=1.0,
    seed=27)
features = [x for x in X_train.columns if x not in ['Default Flag','ID']]
evaluate_model(xgb4, X_train, y_train, features)
and the results I get are
Model Report
Accuracy : 0.803236
AUC Score (Train): 0.856995
My question, which is perhaps ill-informed, is that this evaluate_model() function is never run on the test set of the data, which I found odd. When I do call it on the test set (evaluate_model(xgb4, X_test, y_test, features)), I get this:
Model Report
Accuracy : 0.873706
AUC Score (Train): 0.965286
I want to know whether these two Model Reports are concerning at all, given that the test set has a higher accuracy than the training set. My apologies if this question is poorly structured.
Upvotes: 2
Views: 541
Reputation: 3591
I will develop my answer a little bit more:
This function trains on the dataset you give it and returns the accuracy and AUC measured on that same dataset. These are training scores, so this is not a reliable way to evaluate your models.
In the link you provided, it is said that this function is used to tune the number of estimators:
The function below performs the following actions to find the best number of boosting trees to use on your data:
- Trains an XGBoost model using features of the data.
- Performs k-fold cross validation on the model, using accuracy and AUC score as the evaluation metric.
- Returns output for each boosting round so you can see how the model is learning. You will look at the detailed output in the next section.
- It stops running after the cross-validation score does not improve significantly with additional boosting rounds, giving you an optimal number of estimators for the model.
You should not use it to evaluate your model's performance; instead, perform a clean cross-validation, where each score is computed on data the model was not fitted on.
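Your test scores are higher in this case because calling the function on the test set refits the model on that set and scores it on the same rows. Your test set is smaller, so the model overfits it more easily.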
Upvotes: 1