Kevin

Reputation: 6833

The effect of Decision Tree Pruning

Suppose I build a decision tree A with an algorithm like ID3 from a training and validation set, and A is left unpruned. At the same time, I build another ID3 decision tree B from the same training and validation set, but B is pruned. If I now test both A and B on a future unlabeled test set, is it always the case that the pruned tree will perform better? Any ideas are welcome, thanks.

Upvotes: 3

Views: 3456

Answers (4)

Thilina Samiddhi

Reputation: 326

I agree with the first answer by @AMRO. Post-pruning is the most common approach to decision tree pruning, and it is done after the tree is built. However, pre-pruning is also possible. In pre-pruning, the tree is pruned by halting its construction early, based on a specified threshold value, for example by deciding not to split the subset of training tuples at a given node any further.

That node then becomes a leaf. The leaf may hold the most frequent class among the subset of tuples, or the probability distribution of those tuples.
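
To make the pre-pruning idea concrete, here is a minimal sketch of an ID3-style builder that stops splitting when a node holds too few training tuples. The names (`build_tree`, `min_samples_split`, the toy weather data) are my own assumptions for illustration, not part of any particular library, and real implementations usually offer other thresholds as well (minimum gain, maximum depth).

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def majority(labels):
    # Most frequent class; ties go to the class seen first.
    return Counter(labels).most_common(1)[0][0]

def information_gain(rows, labels, feature):
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [l for r, l in zip(rows, labels) if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

def build_tree(rows, labels, features, min_samples_split=2):
    # Usual ID3 stops (pure node, no features left) plus the pre-pruning
    # threshold: do not split a node with fewer than min_samples_split tuples.
    if len(set(labels)) == 1 or not features or len(rows) < min_samples_split:
        return {"leaf": majority(labels)}
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    rest = [f for f in features if f != best]
    node = {"feature": best, "default": majority(labels), "children": {}}
    for value in {r[best] for r in rows}:
        pairs = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        node["children"][value] = build_tree(
            [r for r, _ in pairs], [l for _, l in pairs], rest, min_samples_split)
    return node

def count_leaves(node):
    if "leaf" in node:
        return 1
    return sum(count_leaves(child) for child in node["children"].values())

# Hypothetical toy data, made up for this example.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
    {"outlook": "overcast", "windy": "no"},
    {"outlook": "overcast", "windy": "yes"},
]
labels = ["play", "stay", "stay", "stay", "play", "play"]

full_tree = build_tree(rows, labels, ["outlook", "windy"], min_samples_split=2)
pre_pruned = build_tree(rows, labels, ["outlook", "windy"], min_samples_split=3)
print(count_leaves(full_tree))   # 4
print(count_leaves(pre_pruned))  # 3
```

With the higher threshold, the two "sunny" tuples are never split on `windy`, so that node becomes a leaf holding its majority class.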

Upvotes: 0

Jacob

Reputation: 34601

Pruning is supposed to improve classification by preventing overfitting. Since pruning only occurs if it improves classification rates on the validation set, a pruned tree will perform as well as or better than an unpruned tree during validation.

Upvotes: 1

Amro

Reputation: 124543

I think we need to make the distinction clearer: a pruned tree always performs as well as or better than the unpruned tree on the validation set, but not necessarily on the test set (in fact, its performance on the training set is also equal or worse). I am assuming that the pruning is done after the tree is built (i.e. post-pruning).

Remember that the whole reason for using a validation set is to avoid overfitting the training dataset, and the key point here is generalization: we want a model (decision tree) that generalizes beyond the instances provided at "training time" to new, unseen examples.
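
As a rough illustration of that distinction, here is a small sketch of reduced-error post-pruning, which is one common way to prune against a validation set (I am assuming this variant; the question does not say which one is used). The tree, the data, and all function names are made up for the example, and the test set is deliberately constructed so that the pruned tree does worse on it.

```python
def predict(node, row):
    # Walk down the tree; unseen feature values fall back to the node's default class.
    while "leaf" not in node:
        node = node["children"].get(row[node["feature"]], {"leaf": node["default"]})
    return node["leaf"]

def accuracy(tree, rows, labels):
    return sum(predict(tree, r) == l for r, l in zip(rows, labels)) / len(rows)

def prune(node, rows, labels):
    """Bottom-up reduced-error pruning using the validation tuples that reach this node."""
    if "leaf" in node:
        return node
    children = {}
    for value, child in node["children"].items():
        reach = [(r, l) for r, l in zip(rows, labels) if r[node["feature"]] == value]
        children[value] = prune(child, [r for r, _ in reach], [l for _, l in reach])
    subtree = {**node, "children": children}
    subtree_errors = sum(predict(subtree, r) != l for r, l in zip(rows, labels))
    leaf_errors = sum(l != node["default"] for l in labels)
    # Replace the subtree by a leaf only if that does not add validation errors.
    return {"leaf": node["default"]} if leaf_errors <= subtree_errors else subtree

# Hypothetical unpruned tree, as if learned from some training set.
tree = {
    "feature": "outlook", "default": "play",
    "children": {
        "sunny": {"feature": "windy", "default": "play",
                  "children": {"no": {"leaf": "play"}, "yes": {"leaf": "stay"}}},
        "rainy": {"leaf": "stay"},
        "overcast": {"leaf": "play"},
    },
}

val_rows = [{"outlook": "sunny", "windy": "no"}, {"outlook": "sunny", "windy": "yes"},
            {"outlook": "rainy", "windy": "no"}, {"outlook": "overcast", "windy": "yes"}]
val_labels = ["play", "play", "stay", "play"]

test_rows = [{"outlook": "sunny", "windy": "yes"}, {"outlook": "sunny", "windy": "no"},
             {"outlook": "rainy", "windy": "yes"}]
test_labels = ["stay", "play", "stay"]

pruned = prune(tree, val_rows, val_labels)
print(accuracy(tree, val_rows, val_labels), accuracy(pruned, val_rows, val_labels))
print(accuracy(tree, test_rows, test_labels), accuracy(pruned, test_rows, test_labels))
```

On the validation set the pruned tree cannot do worse, because a subtree is only collapsed when that adds no validation errors. The test set carries no such guarantee: here the collapsed "sunny" node misclassifies a test tuple that the unpruned tree gets right.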

Upvotes: 3

smilingthax

Reputation: 5724

Bad pruning can lead to wrong results. Although a smaller decision tree is often desirable, the usual aim of pruning is better results. Therefore, how you prune is the crux of the matter.

Upvotes: 0
