Lee

Reputation: 369

Plot ROC curve for glm in R

My actual data has many columns, so I wrote the code to select columns by position. I want to plot a ROC curve after fitting a logistic regression. To demonstrate what I want to do, I made a simple data frame, df:

df<-data.frame(pass=c(0,1,0,0,1,1,1,0,0,1,0,1,1,1,0,0,0,0,0,1),
               math=c(23,46,66,78,77,88,90,99,21,34,56,55,67,67,88,89,90,12,11,34),
               physics=c(87,43,56,78,44,56,90,99,21,32,45,46,46,77,88,90,32,12,34,57),
               bmi=c(23,24,34,21,18,19,26,37,35,21,12,13,41,25,27,28,34,32,21,22))

#split train and test set
sample <- sample.int(n = nrow(df), size = floor(.7*nrow(df)), replace = F)
train <- df[sample, ]
test  <- df[-sample, ]

x        <- as.matrix(data.frame(train[,2:4]))
y<-as.matrix(train$pass)

glm.fit<-glm(y~x,family="binomial",data=train)
#I cannot change the code above, but I can change the code below to plot the ROC curve.
glm.probs<-predict(glm.fit,test,type="response")

However, the last line fails with a message that the number of rows doesn't match. What I want to do is fit the logistic model on the training set and plot the ROC curve from the test set. My code for the actual data is already written, so I cannot change the fitting code, but I can change everything from glm.probs<-predict(glm.fit,test,type="response") onward.

My goal is to plot the ROC curve and get the AUC value. Any help would be appreciated.

Upvotes: 0

Views: 527

Answers (1)

Dave2e

Reputation: 24139

Does this work for you?

df<-data.frame(pass=c(0,1,0,0,1,1,1,0,0,1,0,1,1,1,0,0,0,0,0,1),
               math=c(23,46,66,78,77,88,90,99,21,34,56,55,67,67,88,89,90,12,11,34),
               physics=c(87,43,56,78,44,56,90,99,21,32,45,46,46,77,88,90,32,12,34,57),
               bmi=c(23,24,34,21,18,19,26,37,35,21,12,13,41,25,27,28,34,32,21,22))

#split train and test set
sample <- sample.int(n = nrow(df), size = floor(.7*nrow(df)), replace = F)
train <- df[sample, ]
test  <- df[-sample, ]

glm.fit<-glm(pass ~ ., family="binomial", data=train)

probs <- glm.probs<-predict(glm.fit, newdata=test, type="response")

tabledata <- data.frame(probs, actual=test$pass)

cutoff <- 0.5
# type="response" gives P(pass = 1), so predict a pass when it exceeds the cutoff
tabledata$predicted <- tabledata$probs > cutoff
tableout<-table(tabledata$actual, tabledata$predicted)
tableout

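Example output (rows are actual, columns are predicted; the counts depend on the random split because no seed is set):

    FALSE TRUE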
  0     2    2
  1     0    2
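
If the fitting code really has to stay as glm(y~x, ...), here is a minimal sketch of one way to get test-set probabilities, the ROC curve, and the AUC. As far as I can tell, the row mismatch arises because the formula refers to a matrix named x rather than to columns of the data frame, so predict() finds no x in test and falls back to the training matrix in the workspace. Passing a list with an element named x built from the test set should avoid that. The seed and the helper names roc_points and auc_rank are my own assumptions, not part of the question, and the helpers are plain base R rather than any package's API.

```r
set.seed(1)  # assumed seed, only so the split is reproducible

df <- data.frame(pass=c(0,1,0,0,1,1,1,0,0,1,0,1,1,1,0,0,0,0,0,1),
                 math=c(23,46,66,78,77,88,90,99,21,34,56,55,67,67,88,89,90,12,11,34),
                 physics=c(87,43,56,78,44,56,90,99,21,32,45,46,46,77,88,90,32,12,34,57),
                 bmi=c(23,24,34,21,18,19,26,37,35,21,12,13,41,25,27,28,34,32,21,22))

idx   <- sample.int(n = nrow(df), size = floor(.7 * nrow(df)), replace = FALSE)
train <- df[idx, ]
test  <- df[-idx, ]

# Fitting code as in the question: predictors by column position, as a matrix
x <- as.matrix(train[, 2:4])
y <- as.matrix(train$pass)
glm.fit <- glm(y ~ x, family = "binomial", data = train)

# The model only knows a variable called `x`, so give predict() a list whose
# element `x` holds the same columns taken from the test set
glm.probs <- predict(glm.fit, newdata = list(x = as.matrix(test[, 2:4])),
                     type = "response")

# Hypothetical helper: ROC points obtained by sweeping the cutoff from high to low
# (tied scores are stepped through one at a time, which is only approximate)
roc_points <- function(labels, scores) {
  ord <- order(scores, decreasing = TRUE)
  lab <- labels[ord]
  data.frame(fpr = c(0, cumsum(1 - lab) / sum(1 - lab)),
             tpr = c(0, cumsum(lab) / sum(lab)))
}

# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) identity
auc_rank <- function(labels, scores) {
  r  <- rank(scores)            # average ranks for ties
  n1 <- sum(labels == 1)
  n0 <- sum(labels == 0)
  (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

roc <- roc_points(test$pass, glm.probs)
plot(roc$fpr, roc$tpr, type = "l", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "False positive rate", ylab = "True positive rate",
     main = "ROC curve (test set)")
abline(0, 1, lty = 2)           # chance line

auc_rank(test$pass, glm.probs)  # AUC on the test set
```

With only six test rows the curve is very coarse, and the AUC is undefined if the test set happens to contain a single class. If installing a package is an option, pROC's roc() and auc() cover the same ground with more care around ties.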

Upvotes: 0
