Alex Ivanov

Reputation: 737

What do the leaf values of a boosted tree mean?

I plotted a tree, and at the ends of the tree (in the leaves) some values are shown. What do they mean?

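import xgboost as xgb

# model parameters
param = {
    'colsample_bytree': 0.4,
    'objective': 'binary:logistic',
    'learning_rate': 0.05,
    'eval_metric': 'auc',
    'max_depth': 8,
    'min_child_weight': 4,
    'seed': 7,
}
# n_estimators = 5000 is not an xgb.train parameter;
# the number of rounds is set by num_boost_round below

# create and train model
# (dtrain is an xgb.DMatrix and best_iteration an int, both defined earlier)
bst = xgb.train(param,
                dtrain,
                num_boost_round=best_iteration)

# plot the tree and write it to a file
dot = xgb.to_graphviz(bst, rankdir='LR')
dot.render("trees1")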

I thought it was a predicted probability, but the leaf values only range up to about 0.01, whereas predicted probabilities range up to 1. Maybe it means the predicted probability divided by 10 (e.g. leaf value = 0.01 means that the predicted probability = 0.1)?

And why do some leaves have negative values (e.g. -0.01)? Thank you.

(Image: part of the tree)

Upvotes: 0

Views: 304

Answers (1)

Gwendal Yviquel

Reputation: 392

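The value of a leaf is neither a probability nor your "eval_metric" (the AUC is only used to evaluate the model). With the objective 'binary:logistic', a leaf value is that tree's contribution to the raw score (the log-odds), already scaled by the learning rate. The contributions of all trees are summed and passed through the sigmoid to get the predicted probability. That is why a single leaf value is small, and why it can be negative: a negative value pushes the prediction towards class 0.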
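As a rough illustration of how leaf values relate to the predicted probability under 'binary:logistic', here is a minimal sketch in plain Python. The leaf values are made up for the example, and the starting margin of 0.0 (a base score of 0.5) is an assumption; your model may start from a different value.

```python
import math


def sigmoid(margin: float) -> float:
    """Map a raw score (log-odds) to a probability."""
    return 1.0 / (1.0 + math.exp(-margin))


# Hypothetical leaf values: the leaf one sample lands in, for each of five trees.
leaf_values = [0.01, -0.01, 0.008, 0.012, -0.004]

# Assumed starting margin of 0.0, i.e. a base score of 0.5.
base_margin = 0.0

# The raw score is the sum of the leaf values over all trees.
margin = base_margin + sum(leaf_values)

# The sigmoid turns the raw score into the predicted probability.
probability = sigmoid(margin)

print(f"raw score (log-odds): {margin:.3f}")
print(f"predicted probability: {probability:.4f}")
```

With only a few trees the result stays close to 0.5, but over thousands of boosting rounds these small contributions add up, so the final probability can end up anywhere between 0 and 1. To look at the summed raw score for a real model, bst.predict(dtrain, output_margin=True) should return it before the sigmoid is applied.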

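For comparison, a fitted scikit-learn decision tree (not an XGBoost Booster) exposes its structure through the attributes of "tree_":

# estimator is a fitted scikit-learn tree, e.g. a DecisionTreeClassifier
n_nodes = estimator.tree_.node_count
children_left = estimator.tree_.children_left
children_right = estimator.tree_.children_right
feature = estimator.tree_.feature
threshold = estimator.tree_.threshold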

From the docs: https://scikit-learn.org/stable/auto_examples/tree/plot_unveil_tree_structure.html#sphx-glr-auto-examples-tree-plot-unveil-tree-structure-py

I can't find it in the docs, but "tree_.impurity" exists as well.

Upvotes: 1
