Reputation: 2742
In the toy example below, I renamed the variable cyl to 1_cyl. I am doing this because in my actual data some variable names start with a number. I am applying randomForest using that formula, but I get the error shown below. Other functions work fine with the same formula.
How can I solve this problem?
library(randomForest)
library(rpart)
data(mtcars)
colnames(mtcars)[2] <- "1_cyl"
colnames(mtcars)
# [1] "mpg" "1_cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear" "carb"
(fmla <- as.formula("mpg ~ `1_cyl` + hp"))
randomForest(fmla, data = mtcars, importance = TRUE, na.action = na.exclude)
# Error in eval(expr, envir, enclos) : object '1_cyl' not found
# Other functions work with the same formula:
rpart(fmla, data = mtcars)
glm(fmla, data = mtcars)
Upvotes: 0
Views: 4105
Reputation: 57686
randomForest.formula has a call inside it to reformulate, for some reason, and it looks like that function doesn't like nonstandard names. (It's also calling model.frame twice.)
You can get around this by calling randomForest without a formula, passing it the predictors and the response variable directly. When you use a formula this is what happens anyway; randomForest.formula is just a convenience wrapper that extracts the predictors and response for you.
randomForest(mtcars[, -1], mtcars[, 1])
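Note that the call above uses every column except mpg as a predictor. To stay closer to the formula in the question, here is a minimal sketch of two options: selecting the same two predictors by name, or (an alternative not mentioned above) renaming the columns with base R's make.names so an ordinary formula works. The names df, df_valid, x, y, fit_a and fit_b are only illustrative, and the fits run only if the randomForest package is installed.

```r
# Work on a copy so the built-in data set is left untouched.
data(mtcars)
df <- mtcars
colnames(df)[2] <- "1_cyl"   # reproduce the nonstandard name from the question

# Option A: skip the formula and select the same two predictors by name.
x <- df[, c("1_cyl", "hp")]
y <- df$mpg

# Option B: make the names syntactically valid, then use a plain formula.
df_valid <- df
colnames(df_valid) <- make.names(colnames(df_valid), unique = TRUE)
fmla <- mpg ~ X1_cyl + hp    # make.names turns "1_cyl" into "X1_cyl"

if (requireNamespace("randomForest", quietly = TRUE)) {
  set.seed(1)  # arbitrary seed, only so repeated runs match
  fit_a <- randomForest::randomForest(x = x, y = y, importance = TRUE)
  fit_b <- randomForest::randomForest(fmla, data = df_valid, importance = TRUE)
}
```

Option A keeps the original column names in the importance output; option B changes them, so you may want to keep a lookup from the new names back to the old ones.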
Upvotes: 2