Reputation: 303
I have a data set which I am classifying in WEKA using the J48 and ID3 algorithms. The output of the J48 algorithm is:
Correctly Classified Instances 73 92.4051 %
Incorrectly Classified Instances 6 7.5949 %
Kappa statistic 0.8958
Mean absolute error 0.061
Root mean squared error 0.1746
Relative absolute error 16.7504 %
Root relative squared error 40.9571 %
Total Number of Instances 79
and the output using ID3 is:
Correctly Classified Instances 79 100 %
Incorrectly Classified Instances 0 0 %
Kappa statistic 1
Mean absolute error 0
Root mean squared error 0
Relative absolute error 0 %
Root relative squared error 0 %
Total Number of Instances 79
My question is: if J48 is an extension of ID3 and is newer, how come ID3 is giving a better result than J48?
Upvotes: 0
Views: 10117
Reputation: 167
Decision trees are prone to overfitting the training data. In your case, the ID3 algorithm is overfitting: this is an inherent problem with decision trees, since the algorithm keeps splitting the data until it produces pure sets, effectively memorizing the training instances. This overfitting problem is addressed in ID3's extension, J48, by pruning the tree.
Another point to cover: you should use k-fold cross-validation to validate your model, rather than evaluating it on the data it was trained on.
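To see why evaluating on the training set is misleading, here is a minimal pure-Python sketch (not WEKA; the data and the `fit`/`predict` helpers are made up for illustration). The "model" simply memorizes every training instance, much like an unpruned tree grown until every leaf is pure: it scores 100% when tested on its own training data, but roughly chance level under 10-fold cross-validation, because the labels here are pure noise.

```python
import random

random.seed(0)
# Toy data: 80 distinct points whose labels are pure noise,
# mimicking a tree that splits until every leaf is "pure".
xs = list(range(80))
ys = [random.randint(0, 1) for _ in xs]

def fit(train):
    # "Memorizer": an unpruned tree taken to the extreme --
    # one leaf per training instance, plus a majority-class fallback.
    table = {x: y for x, y in train}
    majority = round(sum(y for _, y in train) / len(train))
    return table, majority

def predict(model, x):
    table, majority = model
    return table.get(x, majority)

# Resubstitution accuracy (testing on the training data): always 100%,
# because every test point was memorized during training.
model = fit(list(zip(xs, ys)))
train_acc = sum(predict(model, x) == y for x, y in zip(xs, ys)) / len(xs)

def k_fold_cv(xs, ys, k=10):
    # Hold out every k-th instance as the test fold, train on the rest.
    accs = []
    for i in range(k):
        test_idx = set(range(i, len(xs), k))
        train = [(x, y) for j, (x, y) in enumerate(zip(xs, ys))
                 if j not in test_idx]
        m = fit(train)
        hits = sum(predict(m, xs[j]) == ys[j] for j in test_idx)
        accs.append(hits / len(test_idx))
    return sum(accs) / k

# Under cross-validation every test point is unseen, so the memorizer
# falls back to the majority class -- roughly chance on noisy labels.
cv_acc = k_fold_cv(xs, ys)
print(f"training-set accuracy: {train_acc:.0%}")
print(f"10-fold CV accuracy:   {cv_acc:.0%}")
```

This is the same gap you would expect between ID3's perfect resubstitution score and its cross-validated score on your 79 instances.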
Upvotes: 0
Reputation: 310
J48, based on C4.5, is an extension of ID3 that accounts for missing attribute values, continuous attribute value ranges, pruning of decision trees, rule derivation, and so on. The result in this case simply reflects the kind of data set you used. ID3 can be the right choice when you need a faster/simpler result and do not need to take into account the additional factors that J48 considers. Take a look into pruning decision trees and deriving rule sets HERE. There are a lot of resources on the web comparing these results; what matters most is learning how each classifier works, so you can identify which one to apply in each case.
Upvotes: 1