Reputation: 8562
Using the famous Iris data set with the Julia decision tree classifier (the DecisionTree package), I get the following tree.
using RDatasets
using DecisionTree
iris = dataset("datasets", "iris")
features = convert(Array, iris[:, 1:4])
labels = convert(Array, iris[:, 5]);
model = build_tree(labels, features)
model = prune_tree(model, 0.9)
print_tree(model)
Feature 3, Threshold 3.0
L-> setosa : 50/50
R-> Feature 4, Threshold 1.8
    L-> Feature 3, Threshold 5.0
        L-> versicolor : 47/48
        R-> Feature 4, Threshold 1.6
            L-> virginica : 3/3
            R-> Feature 1, Threshold 7.2
                L-> versicolor : 2/2
                R-> virginica : 1/1
    R-> Feature 3, Threshold 4.9
        L-> Feature 1, Threshold 6.0
            L-> versicolor : 1/1
            R-> virginica : 2/2
        R-> virginica : 43/43
I can't really interpret the numbers after some of the branches, like "setosa : 50/50" or "virginica : 3/3".
Could somebody explain what those mean?
Upvotes: 4
Views: 124
Reputation: 185
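It looks like the leaf "setosa : 50/50" means that 50 flowers ended up in this node and all 50 of them are setosa, so all 50 were classified correctly. Likewise, "versicolor : 47/48" means that 48 flowers ended up in that node and 47 of them are versicolor; the remaining one belongs to a different class. Since all 50 setosa are already in the first leaf, that one must be a virginica.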
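To make the counting concrete, here is a minimal sketch in plain Julia of how such a "label : matches/total" summary could be derived from the training labels that reach one leaf. The function `leaf_summary` is a hypothetical helper written for illustration and is not part of DecisionTree; it only assumes the leaf reports its most frequent label.

```julia
# Hypothetical helper (not part of DecisionTree): summarize the training
# labels that reached a single leaf as (majority label, matches, total).
function leaf_summary(labels)
    # Count how often each label occurs among the samples in this leaf.
    counts = Dict{String,Int}()
    for l in labels
        counts[l] = get(counts, l, 0) + 1
    end
    # Pick the most frequent label; ties are broken arbitrarily.
    majority, matches = "", 0
    for (label, n) in counts
        if n > matches
            majority, matches = label, n
        end
    end
    return majority, matches, length(labels)
end

# A leaf that received 47 versicolor samples and 1 virginica sample.
leaf = vcat(fill("versicolor", 47), ["virginica"])
majority, matches, total = leaf_summary(leaf)
println(majority, " : ", matches, "/", total)   # prints "versicolor : 47/48"
```

So the number after the slash is how many training samples landed in the leaf, and the number before it is how many of those carry the label the leaf predicts.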
Upvotes: 2