user3499704
user3499704

Reputation: 21

Recursion in trees

I'm building a simple binary decision tree in Python. I'm using recursion to build the tree, but, being a person without a firm grasp on the concept, I'm having some trouble. I want to stop the recursion when the tree gets to a certain depth, but I'm not sure where to increment in the value so it builds more than one branch at a time. Right now, the tree just branches to the right until it hits 5, then it stops. Where/how should I increment the value? Right now I'm doing it in the for loop at the bottom of the function.

def buildTree(currentNode, maxDepth, currentDepth, minGain, currentGain, allFeatures):
print(currentNode.data)
if maxDepth <= currentDepth:
    return None
else:
    splitOn, hasFeat, noFeat, allFeatures, maxGain = split(currentNode, allFeatures)
    print(len(hasFeat), len(noFeat))
    currentNode.left = Tree()
    currentNode.left.vectors = hasFeat
    currentNode.left.data = splitOn
    currentNode.left.entropy = getEntropy(getInstances(currentNode.left.vectors))
    currentNode.right = Tree()
    currentNode.right.vectors = noFeat
    currentNode.right.data = "!" + splitOn
    currentNode.right.entropy = getEntropy(getInstances(currentNode.right.vectors))
    nodeList = [currentNode.right, currentNode.left]
    for node in nodeList:
        return buildTree(node, maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)

Upvotes: 1

Views: 513

Answers (2)

Nir Alfasi
Nir Alfasi

Reputation: 53565

The recursive call should be used on currentNode.left and currentNode.right.

The code should be something like:

def buildTree(currentNode, maxDepth, currentDepth, minGain, currentGain, allFeatures):
    print(currentNode.data)
    if maxDepth <= currentDepth:
        return None
    else:
        splitOn, hasFeat, noFeat, allFeatures, maxGain = split(currentNode, allFeatures)
        print(len(hasFeat), len(noFeat))
        currentNode.left = buildTree(Tree(), maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
        currentNode.left.vectors = hasFeat
        currentNode.left.data = splitOn
        currentNode.left.entropy = getEntropy(getInstances(currentNode.left.vectors))
        currentNode.right = buildTree(Tree(), maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
        currentNode.right.vectors = noFeat
        currentNode.right.data = "!" + splitOn
        currentNode.right.entropy = getEntropy(getInstances(currentNode.right.vectors))
        nodeList = [currentNode.right, currentNode.left]
        return nodeList

Upvotes: 2

Daniel Robinson
Daniel Robinson

Reputation: 3387

Your function can only return once, which is why it's only building the right node; it also shouldn't be returning its child node. Have you tried replacing the last lines with:

for node in nodeList:
    node = buildTree(node, maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
return currentNode

Upvotes: 0

Related Questions