Reputation: 175
Is there a canonical way to iteratively add trees to a random forest?
Let's say I am using the caret package and I fit a model with something like
rf_fit <- train(y ~ ., data = df, method = "rf", ntree = N)
for some N, and then I would like to continue adding trees to it. How would I go about that?
Upvotes: 0
Views: 208
Reputation: 2584
You could write your own function and lapply it over a range of ntree values:
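library(caret)
# Fit a caret random forest with the given number of trees
fit_tree <- function(ntree) {
  train(Species ~ ., data = iris, method = "rf", ntree = ntree)
}
# One model per ntree value: 100, 200, 300, 400, 500
fit <- lapply(seq(100, 500, 100), fit_tree)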
Here fit is a list of 5 caret train objects (each wrapping a randomForest model), each fitted with one of the ntree values given in the first argument of lapply.
I don't know if it is possible to add trees to the same model. If the model fitted with n trees doesn't reach the accuracy you want, you can simply re-fit it with, for example, n+100 trees. Keep in mind, though, that increasing the number of trees doesn't necessarily improve the accuracy; indeed, it could worsen performance. With method = "rf" in the caret package, the default ntree is 500 (the default of the underlying randomForest function), as suggested by Breiman in his original paper (Breiman, 2001).
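One way to judge whether more trees are likely to help, without re-fitting several models, is to look at the out-of-bag (OOB) error that randomForest records as the forest grows. This is only a minimal sketch: it assumes the randomForest package is installed and calls it directly rather than through caret, and the seed and variable names are illustrative.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only for reproducibility
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

# For classification forests, err.rate has one row per tree; the "OOB"
# column is the cumulative out-of-bag error after that many trees.
oob <- rf$err.rate[, "OOB"]

# OOB error at a few forest sizes
print(oob[c(100, 200, 300, 400, 500)])
```

If the OOB error has already flattened out well before the last tree, adding more trees is unlikely to change much; plot(oob, type = "l") shows the whole curve.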
EDIT
To add trees to an existing randomForest model, you can use grow() from the randomForest package, something like this:
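library(randomForest)
# Fit a forest with the default 500 trees, then add how.many trees to it
fit_tree <- function(how.many) {
  rf_fit <- randomForest(Species ~ ., data = iris)
  grow(rf_fit, how.many)
}
# how.many takes the values 10, 20, ..., 100
p <- lapply(seq(10, 100, 10), fit_tree)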
Here the starting ntree is the default one (i.e. 500). Each call to fit_tree fits a fresh 500-tree forest and adds how.many trees to it, so p holds forests with 510, 520, ..., 600 trees.
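The snippet above fits a fresh forest on every call. To keep adding trees to one and the same forest, grow() can instead be applied repeatedly to its own result. The following is a rough sketch under a few assumptions: the starting size of 50 trees, the step of 10 and the three passes are arbitrary choices, and norm.votes = FALSE is set because the randomForest documentation uses it in its grow() and combine() examples.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only for reproducibility

# Start from a small forest; vote counts are left unnormalized so that
# forests can be combined later
rf <- randomForest(Species ~ ., data = iris, ntree = 50, norm.votes = FALSE)

# Add 10 trees to the same forest on each pass
for (i in 1:3) {
  rf <- grow(rf, 10)
  cat("trees so far:", rf$ntree, "\n")
}

# The grown forest can be used for prediction as usual
pred <- predict(rf, iris)
```

Note that grow() works by combining forests, and according to the documentation of combine() the confusion and err.rate components of the combined object are NULL, so the accuracy of the grown forest has to be checked some other way, for example on held-out data.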
With this approach it is not very helpful to tune the mtry parameter with caret, since the best parameter value would be found only for the first call to randomForest, not for the updated model.
Upvotes: 1