Cross-validating a logistic regression in R

Question

I am running a logistic regression on a binary DV with two predictors: gender (binary) and political leaning (continuous). I need help getting my GLMs to run in a cross-validation. I can't get my code to work despite reclassifying the variables multiple times, and I'm not sure what's going on.

Here is the code I have:

#######################################################
#     Cross-Validation of the Logistic Regression
#######################################################


gen <- as.numeric(choicelife.data$gender)
lnc <- as.numeric(choicelife.data$lc)
procprol <-as.numeric(choicelife.data$views)

# This code could be useful
nCV <- 50
MSE_1 <- numeric(nCV)
MSE_2 <- numeric(nCV)

folds <- cut(sample(n),breaks=nCV,labels=FALSE)

# Perform nCV-fold cross-validation
i <- 1
for(i in 1:nCV){

  # Segment your data by fold using the which() function
  testIndexes <- which(folds==i,arr.ind=TRUE)
  testData <- choicelife.data[testIndexes, ]
  trainData <- choicelife.data[-testIndexes, ]

  # Models
  mod1<- glm(views ~ gen,
             family=binomial(link=logit), data=trainData)

  mod2<- glm(views ~ gen + lnc,
             family=binomial(link=logit), data=trainData)

  # Get predictions
  pred_1 <- predict(mod1, newdata = testData)
  pred_2 <- predict(mod2, newdata = testData)

  # Calculate MSE
  MSE_1[i] <- mean((testData$views - pred_1)^2)
  MSE_2[i] <- mean((testData$views - pred_2)^2)
}
warnings()

# mean MSEs
mean(MSE_1) 
mean(MSE_2) 

# get differences
diffs <- MSE_1 - MSE_2

# get 95% CIs
meandiff <- mean(diffs) 
sddiff <- sd(diffs) 
c(meandiff-2*sddiff, meandiff+2*sddiff) # 95% Confidence interval (n, n)
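
For comparison, here is a minimal sketch of one way the loop above might be made to run. It rests on several assumptions that are not in the original post: `choicelife.data` is replaced by simulated stand-in data, `views` is taken to be coded 0/1, and 10 folds are used instead of 50 to keep the example small. The likely sticking points it addresses are that `gen` and `lnc` live outside the data frame (so they do not get subset along with `trainData`, which typically produces a "variable lengths differ" error), that `n` is never defined, and that `predict()` returns log-odds unless `type = "response"` is requested. The interval at the end uses the standard error of the fold differences rather than their standard deviation; it is only approximate, since fold estimates are not independent.

```r
set.seed(1)

# Hypothetical stand-in for choicelife.data (simulated, not the real data)
n <- 200
choicelife.data <- data.frame(
  gender = factor(sample(c("F", "M"), n, replace = TRUE)),
  lc     = rnorm(n)
)
# Simulate a 0/1 outcome that depends on both predictors
eta <- -0.3 + 0.8 * (choicelife.data$gender == "M") + 1.2 * choicelife.data$lc
choicelife.data$views <- rbinom(n, size = 1, prob = plogis(eta))

# Store the recoded predictors inside the data frame, so that subsetting
# rows for each fold subsets the predictors as well
choicelife.data$gen <- as.numeric(choicelife.data$gender)
choicelife.data$lnc <- as.numeric(choicelife.data$lc)

nCV <- 10  # assumption: 10 folds rather than the 50 in the post
folds <- sample(rep(seq_len(nCV), length.out = n))  # balanced random fold labels
MSE_1 <- numeric(nCV)
MSE_2 <- numeric(nCV)

for (i in seq_len(nCV)) {
  # Rows in fold i are held out; all other rows are used for fitting
  testIndexes <- which(folds == i)
  testData  <- choicelife.data[testIndexes, ]
  trainData <- choicelife.data[-testIndexes, ]

  mod1 <- glm(views ~ gen, family = binomial(link = "logit"), data = trainData)
  mod2 <- glm(views ~ gen + lnc, family = binomial(link = "logit"), data = trainData)

  # type = "response" gives predicted probabilities rather than log-odds
  pred_1 <- predict(mod1, newdata = testData, type = "response")
  pred_2 <- predict(mod2, newdata = testData, type = "response")

  # Mean squared error of the predicted probabilities (the Brier score)
  MSE_1[i] <- mean((testData$views - pred_1)^2)
  MSE_2[i] <- mean((testData$views - pred_2)^2)
}

# Per-fold differences and an approximate 95% interval for their mean
diffs <- MSE_1 - MSE_2
meandiff <- mean(diffs)
sediff <- sd(diffs) / sqrt(nCV)
ci <- c(meandiff - 2 * sediff, meandiff + 2 * sediff)
```

Printing `mean(MSE_1)`, `mean(MSE_2)`, and `ci` afterwards gives the same summaries as the original script. With the real data, the simulation block would be dropped and `n` set to `nrow(choicelife.data)`.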
