Reputation: 441
This class is used to build a decision tree. A lot of values within the trees structure including tree_.value and tree_.impurity are saved as arrays without much indication to which node each value is referring to. My deductions tell me that they are doing a preorder traversal, but I have no conclusive proof that this is how each array is being constructed. Does anyone know where to find this information?
Upvotes: 0
Views: 262
Reputation: 2129
From tree.pyx:
def class Tree:
"""Array-based representation of a binary decision tree.
The binary tree is represented as a number of parallel arrays. The i-th
element of each array holds information about the node `i`. Node 0 is the
tree's root. You can find a detailed description of all arrays in
`_tree.pxd`. NOTE: Some of the arrays only apply to either leaves or split
nodes, resp. In this case the values of nodes of the other type are
arbitrary!
So node[0] refers to the root. To add a node, it uses a splitter class on the leaves, which can either split the nodes based on the leaf that has a greater impurity improvement, or a depth-first splitting. I haven't looked into the order nodes are added to the parallel, but my guess is that they are added in the order in which they are created, which would be "similar" to pre-order transversal if the tree were ordered by impurity improvement.
Upvotes: 1