Reputation: 2637
I'd like to build a logistic regression model using the caret package.
This is my code.
library(caret)
df <- data.frame(response = sample(0:1, 200, replace=TRUE), predictor = rnorm(200,10,45))
outcomeName <-"response"
predictors <- names(df)[!(names(df) %in% outcomeName)]
index <- createDataPartition(df$response, p=0.75, list=FALSE)
trainSet <- df[ index,]
testSet <- df[-index,]
model_glm <- train(trainSet[,outcomeName], trainSet[,predictors], method='glm', family="binomial", data = trainSet)
I get the error "Error: Please use column names for x".
I receive the same error when I replace trainSet[,predictors] with the column name predictors.
Upvotes: 1
Views: 8050
Reputation: 2863
I had the same problem with this code:
`head(iris)
xx <- iris[,-5]
yy <- iris[,5]
rf.imp <- train(x = xx, y = yy, method = "rf", data = iris); rf.imp`
Upvotes: 0
Reputation: 596
Unfortunately, R has a nasty default when you subset a single column with df[,1]: it drops the result to a plain vector, which has no column names. Since you have only one predictor, you ran into exactly this behavior. You can keep the result as a data.frame with either
trainSet[,predictors, drop = FALSE]
or
trainSet[predictors]
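To see the dropping behavior in isolation, here is a small base R sketch. The data frame `d`, its columns and the variable names are made up for illustration and are not part of the question's data.

```r
# Hypothetical two-column data frame, used only for illustration
d <- data.frame(a = 1:3, b = c(2.5, 3.5, 4.5))
cols <- "a"                        # a single column name, as in the question

v   <- d[, cols]                   # drops to a plain vector, column name is lost
df1 <- d[, cols, drop = FALSE]     # stays a one-column data.frame
df2 <- d[cols]                     # list-style indexing also keeps a data.frame

is.data.frame(v)      # FALSE
is.data.frame(df1)    # TRUE
colnames(df2)         # "a"
```

With two or more names in `cols`, all three forms return a data.frame, which is why the problem only shows up with a single predictor.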
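BTW, there are two additional issues with the code:
1. train() expects the predictors as its first argument and the response as its second; your call has them swapped
2. for classification with caret you need response to be a factor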
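A short base R sketch of the factor conversion follows. The vector `response` and the labels "no" and "yes" are assumptions for illustration; plain as.factor() on the 0/1 column works as well.

```r
# Hypothetical 0/1 outcome, used only for illustration
response <- c(0, 1, 1, 0, 1)

# A factor marks the outcome as a class label rather than a number
response_f <- factor(response, levels = c(0, 1), labels = c("no", "yes"))

is.factor(response_f)   # TRUE
levels(response_f)      # "no" "yes"
table(response_f)       # class counts per level
```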
The full code should be:
library(caret)
df <- data.frame(response = sample(0:1, 200, replace=TRUE),
                 predictor = rnorm(200,10,45))
df$response <- as.factor(df$response)
outcomeName <-"response"
predictors <- names(df)[!(names(df) %in% outcomeName)]
index <- createDataPartition(df$response, p=0.75, list=FALSE)
trainSet <- df[ index,]
testSet <- df[-index,]
# predictors first, then the response; the x/y interface takes no data argument
model_glm <- train(trainSet[predictors], trainSet[[outcomeName]], method='glm', family="binomial")
*changed trainSet[,outcomeName] to trainSet[[outcomeName]] for a more explicit conversion to a vector
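As an alternative sketch, train() also has a formula interface, which is where a data argument belongs. The block below rebuilds the same kind of random data so that it runs on its own; the seed and the names `model_formula` and `pred` are arbitrary choices made here, not something from the question.

```r
library(caret)

set.seed(42)  # arbitrary seed, only for reproducibility
df <- data.frame(response = factor(sample(0:1, 200, replace = TRUE)),
                 predictor = rnorm(200, 10, 45))
index <- createDataPartition(df$response, p = 0.75, list = FALSE)
trainSet <- df[index, ]
testSet  <- df[-index, ]

# Formula interface: outcome and predictors are looked up by name in `data`
model_formula <- train(response ~ ., data = trainSet,
                       method = "glm", family = "binomial")

# Predicted classes for the held-out rows, cross-tabulated against the truth
pred <- predict(model_formula, newdata = testSet)
table(predicted = pred, actual = testSet$response)
```

Because the columns are referenced by name inside the formula, the single-column subsetting problem does not come up with this form.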
Upvotes: 4