moinabyssinia

Reputation: 183

Why does mean squared error decrease when the number of trees is increased in Random Forest?

I am using a random forest to model a response variable. When I look at the OOB plot, the mean squared error plummets as the number of trees increases. What explains that decrease?

Upvotes: 0

Views: 1173

Answers (1)

Abhineet Gupta

Reputation: 631

Generally, adding more trees is comparable to adding more features/parameters to your model. Adding features to an ML model generally does not increase the training error: if the additional features are unhelpful, the model can simply leave them unused, so the training error stays at most as high as that of the model with fewer features.

This, however, does not mean that adding more features/parameters is always a good idea, because a reduction in training error does not imply a reduction in generalization error. In other words, your model could be overfitting the training data without showing any error reduction on test data. A good way to find the ideal number of trees is to plot the test error against the number of trees and pick the number at which the test error starts to plateau.
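To make the "plot the test error against the number of trees" idea concrete, here is a minimal sketch using only the Python standard library. It is not a full random forest: it bags fully grown regression trees on a single synthetic feature, so there is no per-split feature subsampling. The data (a noisy sine curve), the function names `fit_tree`, `predict`, `heldout_mse_curve` and `choose_n_trees`, and the 1% tolerance are all assumptions made for illustration, not anything from the question.

```python
import math
import random


def fit_tree(pairs):
    """Grow a full regression tree on (x, y) pairs that are sorted by x.

    Returns a float (leaf mean) or a tuple (threshold, left_subtree, right_subtree).
    """
    n = len(pairs)
    total = sum(y for _, y in pairs)
    if n < 2:
        return total / n
    best_score, best_i = None, None
    left_sum = 0.0
    for i in range(1, n):
        left_sum += pairs[i - 1][1]
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between identical x values
        right_sum = total - left_sum
        # Minimising the squared error of a split is equivalent to
        # maximising this quantity.
        score = left_sum ** 2 / i + right_sum ** 2 / (n - i)
        if best_score is None or score > best_score:
            best_score, best_i = score, i
    if best_i is None:  # every x in this node is identical
        return total / n
    threshold = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    return (threshold, fit_tree(pairs[:best_i]), fit_tree(pairs[best_i:]))


def predict(tree, x):
    """Walk down the tree until a leaf (a plain float) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def heldout_mse_curve(n_trees=100, n_train=200, n_test=200, seed=0):
    """Held-out MSE of the bagged ensemble after 1, 2, ..., n_trees trees."""
    rng = random.Random(seed)

    def make_data(n):
        # Hypothetical data: noisy sine curve on [0, 1].
        xs = [rng.uniform(0, 1) for _ in range(n)]
        ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.3) for x in xs]
        return xs, ys

    x_train, y_train = make_data(n_train)
    x_test, y_test = make_data(n_test)

    prediction_sums = [0.0] * n_test
    curve = []
    for t in range(1, n_trees + 1):
        # Bootstrap sample: draw training rows with replacement.
        idx = [rng.randrange(n_train) for _ in range(n_train)]
        tree = fit_tree(sorted((x_train[i], y_train[i]) for i in idx))
        for j, x in enumerate(x_test):
            prediction_sums[j] += predict(tree, x)
        # The ensemble prediction is the average over the t trees so far.
        mse = sum((s / t - y) ** 2 for s, y in zip(prediction_sums, y_test)) / n_test
        curve.append(mse)
    return curve


def choose_n_trees(curve, tolerance=0.01):
    """Smallest ensemble size whose MSE is within `tolerance` (relative) of the best."""
    target = min(curve) * (1 + tolerance)
    for n_trees, mse in enumerate(curve, start=1):
        if mse <= target:
            return n_trees


curve = heldout_mse_curve()
print("MSE with 1 tree:", round(curve[0], 4))
print("MSE with", len(curve), "trees:", round(curve[-1], 4))
print("Plateau reached at about", choose_n_trees(curve), "trees")
```

Feeding `curve` to any plotting tool gives the error-versus-trees plot described above. The tolerance rule is just one possible way to define "starts to plateau"; reading the number off the plot by eye works as well.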

Upvotes: 0
