Reputation: 11
I'm using randomForest to analyze a training set of 600 rows and 21 variables.
# Construct Random Forest Model
rfmodel <- randomForest(default ~ .,
                        data = train.df,
                        ntree = 500,
                        mtry = 4,
                        importance = TRUE,
                        LocalImp = TRUE,
                        replace = FALSE)
print(rfmodel)
This generates the following:
> rfmodel <- randomForest(default ~ .,
+ data = train.df,
+ ntree = 500,
+ mtry = 4,
+ importance = TRUE,
+ LocalImp = TRUE,
+ replace = FALSE)
Warning message:
In randomForest.default(m, y, ...) :
  The response has five or fewer unique values. Are you sure you want to do
  regression?
> print(rfmodel)
Call:
 randomForest(formula = default ~ ., data = train.df, ntree = 500, mtry = 4, importance = TRUE, LocalImp = TRUE, replace = FALSE)
Type of random forest: regression
Number of trees: 500
No. of variables tried at each split: 4
Mean of squared residuals: 0.1577596
% Var explained: 23.89
The printed summary is missing the confusion matrix for some reason. When I try to look at err.rate, I get this:
head(rfmodel$err.rate)
NULL
Upvotes: 1
Views: 951
Reputation: 37641
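I think you want to do classification, but default is being treated as a numeric variable, so randomForest fits a regression forest instead (that is what the warning and "Type of random forest: regression" are telling you). A regression forest has no confusion matrix or err.rate, which is why both are missing. Try class(train.df$default). If that is in fact a numeric variable, you will need to convert it to a factor before running RF.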
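Something along these lines should work. This is only a sketch: I don't have your data, so the train.df below is a made-up stand-in with a numeric 0/1 default column and four invented predictors (x1 to x4), and mtry = 2 is chosen only to suit that toy data. Note also that R argument names are case sensitive and the documented argument is localImp, so LocalImp = TRUE in your call is most likely being ignored silently.

```r
library(randomForest)

# Hypothetical stand-in for train.df: the response is stored as a number
set.seed(42)
train.df <- data.frame(
  default = rbinom(600, 1, 0.3),  # 0/1 response, not a factor
  x1 = rnorm(600),
  x2 = rnorm(600),
  x3 = runif(600),
  x4 = rnorm(600)
)

# Check how the response is stored; a numeric or integer class means
# randomForest will run regression
class(train.df$default)

# Convert the response to a factor so randomForest runs classification
train.df$default <- factor(train.df$default)

rfmodel <- randomForest(default ~ .,
                        data = train.df,
                        ntree = 500,
                        mtry = 2,           # toy value for 4 predictors
                        importance = TRUE,
                        localImp = TRUE,    # lower-case "l"
                        replace = FALSE)

print(rfmodel)            # now reports a classification forest
head(rfmodel$err.rate)    # OOB and per-class error by number of trees
rfmodel$confusion         # confusion matrix with class.error column
```

With a factor response, print(rfmodel) should show "Type of random forest: classification" along with the OOB error estimate and the confusion matrix, and err.rate will no longer be NULL.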
Upvotes: 1