Reputation: 2978
I am trying to train a random forest, but have issues with the naming of variables:
library("randomForest")
f <- "~ var1_testTRUE + var2_root_subj. + var3_test.en-US"
rf <- randomForest(as.formula(f), data=dtrain, ntree=10, nodesize=10)
This is the error message:
Error in eval(predvars, data, env) : object 'var3_test.en' not found
It's not clear to me why -US is not appended to the feature name. How can I fix it?
Upvotes: 0
Views: 119
Reputation: 3726
var3_test.en-US is a non-syntactic name, so you need to surround it with backticks. As written, R reads the hyphen as the minus operator, so your formula isn't parsed the way you want:
as.formula("~ var1_testTRUE + var2_root_subj. + var3_test.en-US")
# ~var1_testTRUE + var2_root_subj. + var3_test.en - US
With backticks it gets parsed correctly:
as.formula("~ var1_testTRUE + var2_root_subj. + `var3_test.en-US`")
# ~var1_testTRUE + var2_root_subj. + `var3_test.en-US`
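If the formula is assembled from a vector of column names rather than typed by hand, the backticks can be added programmatically. The following is only a minimal sketch in base R: the vector name vars and the helper variables bad, quoted and good are made up for illustration, and it assumes none of the column names contains a backtick itself.

```r
# Column names from the question; the last one is non-syntactic.
vars <- c("var1_testTRUE", "var2_root_subj.", "var3_test.en-US")

# Unquoted: the hyphen is parsed as the minus operator,
# so R sees var3_test.en and US as two separate symbols.
bad <- as.formula(paste("~", paste(vars, collapse = " + ")))
print(all.vars(bad))

# Wrap every name in backticks before pasting the terms together.
# This is harmless for names that are already syntactic.
quoted <- paste0("`", vars, "`")
good <- as.formula(paste("~", paste(quoted, collapse = " + ")))
print(all.vars(good))
```

The resulting formula (good) can then be passed to randomForest() in place of as.formula(f). Another option, if the original names do not need to be preserved, is to rename the columns to syntactic names first, for example with make.names().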
Upvotes: 1