Reputation: 21
I'm building a simple binary decision tree in Python. I'm using recursion to build the tree, but, being a person without a firm grasp on the concept, I'm having some trouble. I want to stop the recursion when the tree gets to a certain depth, but I'm not sure where to increment in the value so it builds more than one branch at a time. Right now, the tree just branches to the right until it hits 5, then it stops. Where/how should I increment the value? Right now I'm doing it in the for loop at the bottom of the function.
def buildTree(currentNode, maxDepth, currentDepth, minGain, currentGain, allFeatures):
print(currentNode.data)
if maxDepth <= currentDepth:
return None
else:
splitOn, hasFeat, noFeat, allFeatures, maxGain = split(currentNode, allFeatures)
print(len(hasFeat), len(noFeat))
currentNode.left = Tree()
currentNode.left.vectors = hasFeat
currentNode.left.data = splitOn
currentNode.left.entropy = getEntropy(getInstances(currentNode.left.vectors))
currentNode.right = Tree()
currentNode.right.vectors = noFeat
currentNode.right.data = "!" + splitOn
currentNode.right.entropy = getEntropy(getInstances(currentNode.right.vectors))
nodeList = [currentNode.right, currentNode.left]
for node in nodeList:
return buildTree(node, maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
Upvotes: 1
Views: 513
Reputation: 53565
The recursive call should be used on currentNode.left
and currentNode.right
.
The code should be something like:
def buildTree(currentNode, maxDepth, currentDepth, minGain, currentGain, allFeatures):
print(currentNode.data)
if maxDepth <= currentDepth:
return None
else:
splitOn, hasFeat, noFeat, allFeatures, maxGain = split(currentNode, allFeatures)
print(len(hasFeat), len(noFeat))
currentNode.left = buildTree(Tree(), maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
currentNode.left.vectors = hasFeat
currentNode.left.data = splitOn
currentNode.left.entropy = getEntropy(getInstances(currentNode.left.vectors))
currentNode.right = buildTree(Tree(), maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
currentNode.right.vectors = noFeat
currentNode.right.data = "!" + splitOn
currentNode.right.entropy = getEntropy(getInstances(currentNode.right.vectors))
nodeList = [currentNode.right, currentNode.left]
return nodeList
Upvotes: 2
Reputation: 3387
Your function can only return once, which is why it's only building the right node; it also shouldn't be returning its child node. Have you tried replacing the last lines with:
for node in nodeList:
node = buildTree(node, maxDepth, currentDepth + 1, minGain, maxGain, allFeatures)
return currentNode
Upvotes: 0