MainstreamDeveloper00

Reputation: 8556

C4.5 and ID3 algorithms with emphasis on practical details

I have started applying data mining algorithms and am now studying decision trees. There is a lot of material on the Internet about the C4.5 and ID3 algorithms, but I want to know the practical details, the pros and cons, and some of the technical subtleties of these two algorithms. If there is a link to such material, I would be glad to have it.

Upvotes: 0

Views: 2647

Answers (1)

bogatron

Reputation: 19159

Two pros of decision trees are that they can handle noisy data and that they provide an intuitive interpretation of the data (you can easily see which attributes the tree considers most important). A con is that they are greedy algorithms (at each node they pick the branching attribute that looks best locally, without considering how that choice affects final classification accuracy), so they don't necessarily yield an optimal tree structure. Decision trees are easily incorporated into ensemble methods, such as random forests.
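To make the greedy step concrete, here is a minimal sketch of how an ID3-style learner might choose a branching attribute by information gain. The function names and the tiny weather-style dataset are made up for illustration, and this is not a full tree builder: it only shows the single local decision made at one node.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on a categorical attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Greedy choice: the attribute with the highest gain at this node only."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy data, invented for this example.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

chosen = best_attribute(rows, labels, ["outlook", "windy"])
```

The choice is made from the labels at the current node alone; nothing looks ahead to see how the subtrees will turn out, which is why the resulting tree is not guaranteed to be optimal.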

C4.5 is an improvement on ID3 that can handle real-valued attributes (ID3 uses categorical attributes) and missing attribute values. There are many descriptions of both algorithms on the internet. Wikipedia has descriptions of both ID3 and C4.5. For another description of both algorithms, you might start here.
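As a rough illustration of the real-valued case, a numeric attribute can be turned into a binary test of the form "value <= t" by trying candidate thresholds between adjacent sorted values and keeping the one with the highest information gain. The sketch below is a simplification under stated assumptions: it uses midpoints and plain information gain, whereas C4.5 itself also uses gain ratio when comparing attributes. The function names and the temperature data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the best binary split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    total = len(pairs)
    base = entropy(labels)
    best_t, best_gain = None, 0.0
    for i in range(1, total):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold can separate two equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        remainder = (len(left) / total) * entropy(left) \
            + (len(right) / total) * entropy(right)
        gain = base - remainder
        if gain > best_gain:
            best_t, best_gain = (lo + hi) / 2, gain
    return best_t, best_gain


# Hypothetical toy data, invented for this example.
temperatures = [64, 65, 68, 80, 83, 85]
outcomes = ["play", "play", "play", "stay", "stay", "stay"]

threshold, gain = best_threshold(temperatures, outcomes)
```

ID3 has no equivalent step, so with ID3 a numeric attribute like this would have to be binned into categories before training.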

Upvotes: 3
