Reputation: 2978
I am trying to train a random forest as follows:
library(caret)
library(randomForest)
nfields <- 5
control <- rfeControl(functions = rfFuncs,
                      method = "repeatedcv",
                      repeats = 1,
                      verbose = TRUE)
fields <- colnames(dtrain)[!colnames(dtrain) %in% "my_target"]
predictors_rfe <- rfe(dtrain[,fields,with=F], dtrain$my_target,
                      rfeControl = control)
Output of the run:
+(rfe) fit Fold01.Rep1 size: 120
-(rfe) fit Fold01.Rep1 size: 120
+(rfe) imp Fold01.Rep1
-(rfe) imp Fold01.Rep1
+(rfe) fit Fold01.Rep1 size: 16
+(rfe) fit Fold02.Rep1 size: 120
-(rfe) fit Fold02.Rep1 size: 120
+(rfe) imp Fold02.Rep1
-(rfe) imp Fold02.Rep1
+(rfe) fit Fold02.Rep1 size: 16
-(rfe) fit Fold02.Rep1 size: 16
+(rfe) fit Fold02.Rep1 size: 8
-(rfe) fit Fold02.Rep1 size: 8
+(rfe) fit Fold02.Rep1 size: 4
-(rfe) fit Fold02.Rep1 size: 4
+(rfe) fit Fold03.Rep1 size: 120
-(rfe) fit Fold03.Rep1 size: 120
+(rfe) imp Fold03.Rep1
# ...
+(rfe) fit Fold10.Rep1 size: 16
-(rfe) fit Fold10.Rep1 size: 16
+(rfe) fit Fold10.Rep1 size: 8
-(rfe) fit Fold10.Rep1 size: 8
+(rfe) fit Fold10.Rep1 size: 4
-(rfe) fit Fold10.Rep1 size: 4
Then I get the error:
Error in { : task 1 failed - "undefined columns selected"
From the error message I cannot understand what is wrong… Could anybody help please?
I found out from here that this is a bug in caret. But that bug was reported and fixed in 2016, and I am using the latest version of caret.
Upvotes: 1
Reputation: 1599
I made an example using iris, following the caret tutorial. Your error is probably in:
dtrain[, fields, with = F]
See the example below using iris:
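set.seed(1)  # make the cross-validation folds reproducible
library(caret)
nfields <- 5
# RFE with random forest helper functions and 10-fold CV, repeated once
control <- rfeControl(functions = rfFuncs,
                      method = "repeatedcv",
                      repeats = 1,
                      verbose = FALSE)
# Numeric columns only; Petal.Width plays the role of the target
irisx <- iris[, 1:4]
fields <- colnames(irisx)[!colnames(irisx) %in% "Petal.Width"]
# irisx is a plain data.frame, so irisx[, fields] selects columns by name
predictors_rfe <- rfe(irisx[, fields],
                      irisx$Petal.Width,
                      rfeControl = control)
predictors_rfe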
> predictors_rfe
Recursive feature selection
Outer resampling method: Cross-Validated (10 fold, repeated 1 times)
Resampling performance over subset size:
 Variables  RMSE Rsquared    MAE  RMSESD RsquaredSD   MAESD Selected
         3 0.196   0.9418 0.1519 0.03502     0.0177 0.02608        *
The top 3 variables (out of 3):
Petal.Length, Sepal.Length, Sepal.Width
If you can provide a reproducible example with your dataset, I can take a closer look at the error.
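In the meantime, here is a minimal sketch of one thing to try. It assumes that dtrain is a data.table (the with = F argument suggests so); this is a guess, not a confirmed diagnosis. The idea is to convert the predictors to a plain data.frame before calling rfe, and to check for column names that R would rewrite. The iris data and the Petal.Width target are stand-ins for your dtrain and my_target, and the names target, x, y and bad_names are only for illustration.

```r
library(data.table)

# Stand-in for the asker's data: a data.table with a numeric target column.
# (iris and "Petal.Width" are placeholders, not the real dtrain / my_target.)
dtrain <- as.data.table(iris[, 1:4])
target <- "Petal.Width"

# Predictor names: every column except the target.
fields <- setdiff(colnames(dtrain), target)

# Select the predictors, then drop the data.table class so that downstream
# code sees an ordinary data.frame.
x <- as.data.frame(dtrain[, fields, with = FALSE])
y <- dtrain[[target]]

# Sanity checks before any modelling.
stopifnot(
  !is.data.table(x),
  all(fields %in% colnames(x)),
  nrow(x) == length(y)
)

# Column names that are not syntactically valid or not unique
# (spaces, leading digits, duplicates) are worth cleaning up first.
bad_names <- colnames(x)[colnames(x) != make.names(colnames(x), unique = TRUE)]
print(bad_names)
```

If bad_names is empty, x and y can be passed on as rfe(x, y, rfeControl = control). If the error persists with a plain data.frame and clean column names, the cause lies elsewhere and a reproducible example would help to narrow it down.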
Upvotes: 2