ArinB

Reputation: 21

sklearn decision tree classifier: How to control max number of branches of each split

I am trying to rebuild in sklearn a two-class decision tree (DT) classifier that I previously built in SAS EM. The target is a two-class categorical variable, and a few of the independent variables are continuous. In SAS I could specify the "Maximum Number of Branches" for each split, so when it is set to 4, some nodes split into 2 branches and some into 4 (especially on continuous variables). I could not find an equivalent parameter in sklearn. I looked at "max_leaf_nodes", but that controls the total number of leaf nodes in the entire tree. I am sure some of you have faced the same situation and already found a solution. Please help or share; I would really appreciate it.

Upvotes: 2

Views: 2311

Answers (1)

momo1644

Reputation: 1804

I don't think this option is available in sklearn. You will find this post very useful for your classification DT, as it lists all the options you have available.

I would recommend creating bins for your continuous variables; this way a variable can be split into at most as many groups as you have bins.

Example: if the continuous variable Col1 has values between 1 and 100, you can create 4 bins: 1-25, 26-50, 51-75, 76-100. Or you can create the bins based on the median.
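
A minimal sketch of the binning idea, assuming NumPy and scikit-learn are installed. The data, the variable names (col1, edges, col1_binned) and the target rule are made up for illustration. Note that sklearn's trees always make binary splits, so binning does not produce a true four-way split; it only limits the candidate thresholds on that variable to the bin boundaries, so the tree can separate at most four groups on it.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical data: one continuous feature in [1, 100] and a binary target.
col1 = rng.uniform(1, 100, size=200)
y = (col1 > 50).astype(int)

# Fixed bins 1-25, 26-50, 51-75, 76-100, encoded as ordinal codes 0..3.
edges = [25, 50, 75]
col1_binned = np.digitize(col1, edges, right=True)

# Alternative: quantile-based edges (quartiles; the middle edge is the median).
quartile_edges = np.quantile(col1, [0.25, 0.5, 0.75])
col1_quartiles = np.digitize(col1, quartile_edges, right=True)

# Fit the tree on the binned feature instead of the raw one.
X = col1_binned.reshape(-1, 1)
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Thresholds the tree actually used on the binned feature
# (leaf nodes have a negative feature index and are filtered out).
used = clf.tree_.threshold[clf.tree_.feature >= 0]
print(used)
```

With four bins there are only three possible thresholds between adjacent bin codes, which is the closest this approach gets to capping the branches per variable. The same codes could be passed alongside other, unbinned columns.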

Upvotes: 1
