Reputation: 192
I ran a random forest classification using the randomForest package. When it finished, I called summary() on my classifier, and it appeared that the ntree parameter had been left equal to 1, although I was told that the default value is 500 and that it can be changed manually through the ntree argument of randomForest, which I tried without success.
I also tried it with another dataset and had the same issue. Does anyone have any idea what might be going on?
Upvotes: 1
Views: 525
Reputation: 93811
TL;DR: To get a summary of the model, just type the name of the model object. For example, if the model object is rf1, type rf1, not summary(rf1).
Most packages have a summary "method" that gets dispatched when you run summary on an object produced by the package. But in the case of randomForest there doesn't seem to be a summary method. The output of randomForest is a list containing a bunch of model output. When you run summary on it, R just runs the default summary function, which reports the length (along with the class and mode) of each list element, which is not very useful here.
Thus, in this case, when you run summary on your randomForest model object, you're seeing a value of 1 for ntree, because ntree is an element of the list returned by randomForest and it is a vector of length 1. (Note that the column name of the summary output is Length.)
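The behaviour described above can be reproduced without randomForest at all. The following is a minimal sketch in base R, assuming a made-up list called fake_model (a hypothetical stand-in, not a real fitted model) whose ntree element holds 500:

```r
# Hypothetical stand-in for a fitted model: a plain list with no class of
# its own, so summary() falls through to the default method.
fake_model <- list(
  ntree = 500,                  # a single number, so its length is 1
  mtry = 2,
  predicted = c("a", "b", "a")
)

s <- summary(fake_model)
print(s)                        # columns are Length, Class and Mode

# The Length column holds how long each element is, not what it contains.
print(s["ntree", "Length"])

# The stored value has to be read from the list itself.
print(fake_model$ntree)
```

The Length entry for ntree describes the size of the vector, while fake_model$ntree returns the number that was actually stored.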
To see a summary of model results, just type the name of your model object and this will cause an actual summary to be printed to the console. For example, if your model object is called rf1, just type rf1, not summary(rf1). Typing the object name causes the print.randomForest method to be dispatched, and this does provide a summary of the randomForest results, including ntree.
If you want to extract the value of ntree or other results from your model, run str(rf1) to see the structure of the list returned by randomForest and also look at the help for randomForest for additional information on what's in this list. For example, rf1$ntree would return the number of trees in the model.
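As a rough illustration of these steps, here is a sketch that assumes the randomForest package is installed. The built-in iris data, the seed, and ntree = 200 are arbitrary choices made for the example and do not come from the question:

```r
# Assumes the randomForest package is installed; the block is skipped otherwise.
if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(42)  # arbitrary seed, only so the run is repeatable

  # iris and ntree = 200 are illustrative assumptions.
  rf1 <- randomForest::randomForest(Species ~ ., data = iris, ntree = 200)

  print(rf1)                # same as typing rf1 at the console
  str(rf1, max.level = 1)   # top-level structure of the returned list
  print(rf1$ntree)          # number of trees stored in the model object
}
```

If rf1$ntree matches the value passed in the call, the ntree argument was honoured and only the summary() output was misleading.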
Upvotes: 3