Reputation: 737
I plotted a tree, and at the ends of the tree (in the leaves) some values are shown. What do they mean?
import xgboost as xgb

# model parameters
param = {
    'colsample_bytree': 0.4,
    'objective': 'binary:logistic',
    'learning_rate': 0.05,
    'eval_metric': 'auc',
    'max_depth': 8,
    'min_child_weight': 4,
    'seed': 7,
}
n_estimators = 5000  # sklearn-style name; xgb.train takes num_boost_round instead

# create and train model (dtrain is an xgb.DMatrix built earlier,
# best_iteration was determined elsewhere, e.g. via early stopping)
bst = xgb.train(param,
                dtrain,
                num_boost_round=best_iteration)

# plot the trees and save the rendering
dot = xgb.to_graphviz(bst, rankdir='LR')
dot.render("trees1")
I thought it was a predicted probability score, but the leaf values only range up to about 0.01, whereas predicted probabilities range up to 1. Maybe it is the predicted probability divided by 10 (e.g. a leaf value of 0.01 means a predicted probability of 0.1)?
And why do some leaves have negative values (e.g. -0.01)? Thank you.
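For reference, the same leaf values can be inspected programmatically by dumping the trees as text; this is a minimal sketch, assuming the trained bst Booster from the snippet above:
# dump every tree as text; leaf lines look like "leaf=0.0123"
for i, tree_text in enumerate(bst.get_dump(with_stats=True)):
    print("--- tree %d ---" % i)
    print(tree_text)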
Upvotes: 0
Views: 304
Reputation: 392
The value of a leaf is your "eval_metric", local to that split :). For you, it is the AUC.
Here are all the attributes of a tree:
n_nodes = estimator.tree_.node_count             # total number of nodes
children_left = estimator.tree_.children_left    # id of each node's left child (-1 for leaves)
children_right = estimator.tree_.children_right  # id of each node's right child (-1 for leaves)
feature = estimator.tree_.feature                # feature index used at each split
threshold = estimator.tree_.threshold            # split threshold at each node
I can't find it in the docs, but tree_.impurity exists as well.
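As an illustration of how those arrays fit together, here is a sketch that walks a tree node by node; it assumes estimator is a fitted scikit-learn DecisionTreeClassifier (the iris data is just a hypothetical stand-in):
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# hypothetical stand-in data, just so estimator.tree_ exists
X, y = load_iris(return_X_y=True)
estimator = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

tree = estimator.tree_
for node in range(tree.node_count):
    if tree.children_left[node] == -1:  # -1 marks a leaf
        print("node %d: leaf, impurity=%.3f" % (node, tree.impurity[node]))
    else:
        print("node %d: split on feature %d at threshold %.3f"
              % (node, tree.feature[node], tree.threshold[node]))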
Upvotes: 1