Reputation: 4917
There are several algorithms for building decision trees, such as CART (Classification and Regression Trees) and ID3 (Iterative Dichotomiser 3).
Which decision tree algorithm does scikit-learn use by default?
When I look at some decision-tree Python scripts, they magically produce the results with fit and predict functions.
Does scikit-learn cleverly choose the best decision-tree algo based on the data?
Upvotes: 2
Views: 1934
Reputation: 21
It actually uses an optimized version of CART; the splitting criterion can be either Gini impurity or entropy. You can look into How to tune a Decision Tree? for an explanation of what happens under the hood.
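Since the criterion is just a parameter, you can treat it as something to tune rather than a fixed choice. A minimal sketch (synthetic data; the parameter grid is purely illustrative) showing both supported criteria being searched over:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic toy data, just for demonstration.
X, y = make_classification(n_samples=200, random_state=0)

# scikit-learn's CART implementation accepts either "gini" or
# "entropy" as the splitting criterion, so both can be tuned
# alongside other hyperparameters.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"criterion": ["gini", "entropy"], "max_depth": [2, 4, None]},
    cv=5,
)
search.fit(X, y)
print(search.best_params_)
```

Whichever criterion wins here depends on the data; the point is that both are available from the same estimator.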
Upvotes: 2
Reputation: 3586
It doesn't automatically do so.
If we look at the sklearn.tree.DecisionTreeClassifier page, we can see that the default criterion is Gini impurity.
There is also an option to use entropy as the criterion instead.
Note that CART uses Gini impurity while ID3 uses entropy as the splitting criterion.
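A short sketch confirming this: DecisionTreeClassifier uses Gini impurity unless you explicitly pass criterion="entropy" (the iris dataset here is just a convenient example).

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The constructor stores its parameters as attributes, so the
# default criterion is visible even before fitting.
clf_default = DecisionTreeClassifier(random_state=0)
print(clf_default.criterion)  # prints "gini"

# Same CART machinery, entropy as the split measure instead.
clf_entropy = DecisionTreeClassifier(criterion="entropy", random_state=0)
clf_entropy.fit(X, y)
```

So nothing is chosen automatically based on the data; you opt into entropy yourself if you want ID3-style splits.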
Upvotes: 3