Reputation: 35
there has been a similar question to mine 6 years+ ago and it hasn't been solve (R -- Can I apply the train function in caret to a list of data frames?) This is why I am bringing up this topic again.
I'm writing my own functions for my big R project at the moment and I'm wondering if there is an opportunity to sum up the model training function train()
of the pakage caret
for different dataframes with different predictors.
My function should look like this:
lda_ex <- function(data, predictor){
model <- train(predictor ~., data,
method = "lda",
trControl = trainControl(method = "none"),
preProc = c("center","scale"))
return(model)
}
Using it afterwards should work like this:
data_iris <- iris
predictor_iris <- "Species"
iris_res <- lda_ex(data = data_iris, predictor = predictor_iris)
Unfortunately the R formula is not able to deal with a variable as input as far as I tried.
Is there something I am missing? Thank you in advance for helping me out!
Solving this would help me A LOT to keep my function sheet clean and safe work for sure.
Upvotes: 3
Views: 176
Reputation: 552
By writing predictor_iris <- "Species"
, you are basically saving a string object in predictor_iris
. Thus, when you run lda_ex
, I guess you incur in some error concerning the formula
object in train()
, since you are trying to predict a string using vectors of covariates.
Indeed, I tried the following toy example:
X = rnorm(1000)
Y = runif(1000)
predictor = "Y"
lm(predictor ~ X)
which gives an error about differences in the lengths of variables.
Let me modify your function:
lda_ex <- function(data, formula){
model <- train(formula, data,
method = "lda",
trControl = trainControl(method = "none"),
preProc = c("center","scale"))
return(model)
}
The key difference is that now we must pass in the whole formula
, instead of the predictor only. In that way, we avoid the string-related problem.
library(caret) # Recall to specify the packages needed to reproduce your examples!
data_iris <- iris
formula_iris = Species ~ . # Key difference!
iris_res <- lda_ex(data = data_iris, formula = formula_iris)
Upvotes: 2