Cauder

Reputation: 2637

Error: Please use column names for `x` when using caret() for logistic regression

I'd like to build a logistic regression model using the caret package.

This is my code.

library(caret)
df <- data.frame(response = sample(0:1, 200, replace=TRUE),  predictor = rnorm(200,10,45)) 

outcomeName <-"response"
predictors <- names(df)[!(names(df) %in% outcomeName)]
index <- createDataPartition(df$response, p=0.75, list=FALSE)
trainSet <- df[ index,]
testSet <- df[-index,]

model_glm <- train(trainSet[,outcomeName], trainSet[,predictors], method='glm', family="binomial", data = trainSet)

Running it fails with: Error: Please use column names for `x`

I receive the same error when I replace trainSet[,predictors] with the column name predictors.

Upvotes: 1

Views: 8050

Answers (2)

Seyma Kalay

Reputation: 2863

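I had the same problem. Passing the predictors as a data frame to x and the outcome as a factor to y worked for me:

library(caret)
head(iris)
xx <- iris[,-5]   # predictors: all columns except Species, kept as a data frame
yy <- iris[,5]    # outcome: Species, a factor
rf.imp <- train(x = xx, y = yy, method = "rf")
rf.imp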

Upvotes: 0

kwiscion

Reputation: 596

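Unfortunately, R has an awkward default: subsetting a single column with df[,1] drops the result to a plain vector, which no longer carries column names. Because you have only one predictor, trainSet[,predictors] hits exactly this case, and train() complains about the missing column names. You can keep the result as a data.frame with either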

trainSet[,predictors, drop = FALSE]

or

trainSet[predictors]
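
To see the difference, here is a small base-R sketch. The data frame `toy` and its values are made up for illustration; only the subsetting forms come from the answer above.

```r
# Toy data frame with a single predictor column (hypothetical values)
toy <- data.frame(response = c(0, 1, 0, 1), predictor = c(2.5, 3.1, 4.7, 5.0))
predictors <- "predictor"

dropped   <- toy[, predictors]                # single column collapses to a numeric vector
kept      <- toy[, predictors, drop = FALSE]  # stays a one-column data.frame
kept_list <- toy[predictors]                  # list-style indexing also keeps the data.frame

is.data.frame(dropped)   # the vector is no longer a data.frame
colnames(dropped)        # and it has no column names
is.data.frame(kept)
colnames(kept)
colnames(kept_list)
```

The two "kept" forms are what train() needs for x, since it looks up the predictors by column name.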

By the way, there are two additional issues with the code:

  1. The first argument of train() should be the predictors (x); the response (y) comes second
  2. For logistic regression with caret, the response needs to be a factor
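
A minimal sketch of the factor conversion in base R, assuming a 0/1 coded outcome as in the question. The labels "no" and "yes" are invented for illustration. Descriptive labels are optional here, but they matter if you later ask caret for class probabilities via trainControl(classProbs = TRUE), which expects level names that are valid R variable names.

```r
set.seed(42)  # arbitrary seed, only for reproducibility
response <- sample(0:1, 200, replace = TRUE)

# Plain conversion: the levels become the strings "0" and "1"
response_f <- as.factor(response)

# Optional: descriptive labels (hypothetical names, not from the original post)
response_lab <- factor(response, levels = c(0, 1), labels = c("no", "yes"))

levels(response_f)
levels(response_lab)
table(response, response_lab)  # cross-check that 0 maps to "no" and 1 to "yes"
```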

The full code should be:

library(caret)
df <- data.frame(response = sample(0:1, 200, replace=TRUE),  
                 predictor = rnorm(200,10,45)) 

df$response <- as.factor(df$response)

outcomeName <-"response"
predictors <- names(df)[!(names(df) %in% outcomeName)]
index <- createDataPartition(df$response, p=0.75, list=FALSE)
trainSet <- df[ index,]
testSet <- df[-index,]

# x = predictors (data frame), y = response (factor); the x/y interface needs no data= argument
model_glm <- train(trainSet[predictors], trainSet[[outcomeName]], method='glm', family="binomial")

Note: I changed trainSet[,outcomeName] to trainSet[[outcomeName]] so that the extraction of the response as a vector is explicit.
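
An alternative that avoids the subsetting question altogether is caret's formula interface, where data = is the intended way to pass the data frame. This is a sketch under the same simulated data as in the question; the seed and the name `model_formula` are assumptions for illustration.

```r
library(caret)

set.seed(123)  # arbitrary seed for reproducibility
df <- data.frame(response  = factor(sample(0:1, 200, replace = TRUE)),
                 predictor = rnorm(200, 10, 45))

index    <- createDataPartition(df$response, p = 0.75, list = FALSE)
trainSet <- df[index, ]

# Formula interface: caret builds the predictors and the outcome from trainSet itself
model_formula <- train(response ~ ., data = trainSet,
                       method = "glm", family = "binomial")
model_formula
```

With a single predictor, both interfaces should fit the same model; the formula form simply leaves the column handling to caret.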

Upvotes: 4
