human torch

Reputation: 303

WEKA difference between output of J48 and ID3 algorithm

I have a data set which I am classifying in WEKA using the J48 and ID3 algorithms. The output of the J48 algorithm is:

Correctly Classified Instances          73               92.4051 %
Incorrectly Classified Instances         6                7.5949 %
Kappa statistic                          0.8958
Mean absolute error                      0.061
Root mean squared error                  0.1746
Relative absolute error                 16.7504 %
Root relative squared error             40.9571 %
Total Number of Instances               79

and the output using ID3 is:

Correctly Classified Instances          79              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances               79

My question is, if J48 is an extension of ID3 and is newer compared to it, how come ID3 is giving a better result than J48?

Upvotes: 0

Views: 10117

Answers (2)

Rupesh Kamble

Reputation: 167

Decision trees are prone to over-fitting, and in your case the ID3 algorithm is over-fitting the data. ID3 keeps splitting the data until it produces pure sets, so it can fit the data it was trained on perfectly, noise included. Its extension, J48, addresses this over-fitting problem by pruning the tree.
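The over-fitting effect described above can be sketched with a toy ID3-style tree. This is a minimal illustration, not WEKA's implementation: the data set, the attribute names (`outlook`, `windy`), and the `max_depth` parameter are all hypothetical, and the depth limit is only a crude stand-in for pruning (J48 prunes the tree after growing it rather than capping its depth).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def build_tree(rows, attributes, max_depth=None):
    """Grow an ID3-style tree from (feature_dict, label) pairs.

    max_depth is a hypothetical, crude stand-in for pruning.
    """
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stop when the set is pure, no attributes remain, or the depth limit is hit.
    if len(set(labels)) == 1 or not attributes or max_depth == 0:
        return majority

    def remainder(attr):
        # Weighted entropy after splitting on attr; the lowest remainder
        # corresponds to the highest information gain.
        groups = {}
        for features, label in rows:
            groups.setdefault(features[attr], []).append(label)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(attributes, key=remainder)
    rest = [a for a in attributes if a != best]
    next_depth = None if max_depth is None else max_depth - 1
    branches = {}
    for value in {features[best] for features, _ in rows}:
        subset = [(f, l) for f, l in rows if f[best] == value]
        branches[value] = build_tree(subset, rest, next_depth)
    return (best, branches, majority)

def predict(tree, features):
    """Walk the tree; unseen attribute values fall back to the node majority."""
    while isinstance(tree, tuple):
        attr, branches, majority = tree
        tree = branches.get(features[attr], majority)
    return tree

def accuracy(tree, rows):
    return sum(predict(tree, f) == l for f, l in rows) / len(rows)

# Toy data (assumption): the true rule is "play iff sunny", with one noisy row.
train_rows = (
    [({"outlook": "sunny", "windy": False}, "yes")] * 3
    + [({"outlook": "sunny", "windy": True}, "no")]      # noisy instance
    + [({"outlook": "rainy", "windy": False}, "no")] * 3
    + [({"outlook": "rainy", "windy": True}, "no")]
)
held_out_rows = (
    [({"outlook": "sunny", "windy": True}, "yes")] * 2
    + [({"outlook": "sunny", "windy": False}, "yes")]
    + [({"outlook": "rainy", "windy": True}, "no")]
    + [({"outlook": "rainy", "windy": False}, "no")]
)

full_tree = build_tree(train_rows, ["outlook", "windy"])               # split until pure
shallow_tree = build_tree(train_rows, ["outlook", "windy"], max_depth=1)

print("full tree:    train", accuracy(full_tree, train_rows),
      "held-out", accuracy(full_tree, held_out_rows))
print("shallow tree: train", accuracy(shallow_tree, train_rows),
      "held-out", accuracy(shallow_tree, held_out_rows))
```

On this toy data the fully grown tree memorises the noisy row, so it is perfect on the training rows but worse than the shallow tree on the held-out rows. A 100% score measured on the training set therefore says little about which model generalises better.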

Another point: you should use k-fold cross-validation to validate your model, rather than judging it on the data it was trained on.
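As a rough sketch of what k-fold cross-validation does, the snippet below splits the instances into k folds and scores a model only on the fold it did not see during training. The helper names (`k_fold_splits`, `cross_validate`), the majority-class stand-in classifier, and the toy rows are assumptions for illustration; in WEKA the equivalent is to select the "Cross-validation" test option instead of "Use training set".

```python
import random
from collections import Counter

def k_fold_splits(n_instances, k, seed=0):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

def cross_validate(rows, k, fit, predict, seed=0):
    """Fraction of instances classified correctly while held out."""
    correct = 0
    for train_idx, test_idx in k_fold_splits(len(rows), k, seed):
        model = fit([rows[i] for i in train_idx])
        correct += sum(predict(model, rows[i][0]) == rows[i][1] for i in test_idx)
    return correct / len(rows)

# Stand-in classifier (assumption): always predict the majority training label.
def fit_majority(rows):
    return Counter(label for _, label in rows).most_common(1)[0][0]

def predict_majority(model, features):
    return model

# Toy data (assumption): 7 instances of class "a" and 3 of class "b".
rows = [({"x": i}, "a" if i < 7 else "b") for i in range(10)]

cv_accuracy = cross_validate(rows, k=5, fit=fit_majority, predict=predict_majority)
print("5-fold cross-validated accuracy:", cv_accuracy)
```

Any classifier with a `fit`/`predict` pair of this shape could be plugged in. Every instance is tested exactly once, by a model that never saw it, so a tree that merely memorises its training data gets no credit for doing so.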

Upvotes: 0

Zegad

Reputation: 310

J48 is generally the more robust model. It is based on C4.5, an extension of ID3 that accounts for unavailable values, continuous attribute value ranges, pruning of decision trees, rule derivation, and so on. The result in this case only reflects the kind of data set you used. ID3 can be used when you need a faster or simpler result without taking into account all the additional factors that J48 considers. Take a look at pruning decision trees and deriving rule sets HERE. There are many resources on the web about these comparative results, but it is more important to learn to identify which classifier to apply in which case once you know how each one works (1).
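One of the extensions listed above, support for continuous attributes, can be illustrated with a short sketch. C4.5-style learners turn a numeric attribute into a binary test of the form `value <= threshold` and pick the threshold that gives the most informative split. The function name `best_threshold`, the use of midpoints between neighbouring values, and the temperature data are assumptions made for illustration, not J48's actual code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, information_gain) for the best split value <= threshold."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no boundary between identical values
        threshold = (low + high) / 2  # midpoint: a common simplification
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Toy data (assumption): a numeric "temperature" attribute and a yes/no class.
temperatures = [64, 65, 68, 69, 70, 71]
play = ["yes", "yes", "yes", "no", "no", "no"]

threshold, gain = best_threshold(temperatures, play)
print("split on temperature <=", threshold, "with information gain", gain)
```

Plain ID3 has no such step: it expects nominal attributes, so numeric data has to be discretised beforehand.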

Upvotes: 1
