Reputation: 161
I have multiple datasets where the response variable is always in the last column of the data frame. I want to fit a GLM (logistic regression) on each one and automate the process. I refer to the response by position in the glm() call, but then the "." on the right-hand side still includes the last column as a predictor.
data(iris)
head(iris)
train <- iris
logit <- glm(train[, length(train)] ~ .,
             data = train, family = "binomial")
summary(logit)
I tried writing train[,length(train)]~ . -train[,length(train)], but it doesn't work.
Upvotes: 1
Views: 282
Reputation: 476
Quite verbose, but I think this should work:
logit <- glm(formula(paste0(names(train)[length(train)], '~.')),
data = train,
family = "binomial")
or using tail:
logit <- glm(formula(paste0(tail(names(train), 1), '~.')),
data = train,
family = "binomial")
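The reason the positional version in the question keeps the last column is that its left-hand side is an expression rather than a column name, so the "." still expands to every column of the data. Building the formula from the column name avoids that. As a minimal sketch of the same idea, reformulate() from the stats package can replace the paste0() step. The helper name fit_last_col_logit is hypothetical, and the two-class subset of iris is only an assumption made here so that the response is genuinely binary:

```r
# Hypothetical helper: logistic regression with the last column as response.
fit_last_col_logit <- function(df) {
  response <- tail(names(df), 1)  # name of the last column
  # reformulate() builds `response ~ .`; as.name() also copes with
  # column names that are not syntactically valid R names.
  f <- reformulate(".", response = as.name(response))
  glm(f, data = df, family = binomial)
}

# Assumption for illustration: keep two species so the response is binary.
train <- droplevels(subset(iris, Species != "setosa"))
logit <- fit_last_col_logit(train)

# Predictors used by the model; the response column is not among them.
attr(terms(logit), "term.labels")
```

Because the helper only looks at the column names, it should be reusable across data frames that follow the "response in the last column" convention.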
Upvotes: 1