annatn998

Reputation: 77

Calculating the error rate over a 10-fold CV

I'm really new to R and I'm struggling to understand some concepts. Could someone help me understand how to calculate the average error rate over a 10-fold CV? This is the code I have below. The mean error rate comes out to 1, even though I know the answer should be 0.149. I suspect it has something to do with my predicted probabilities; perhaps I calculated them wrong. If anyone notices something off in this code, please point it out to me. That would be great!

error_rate_value <- numeric(0)
for (i in 1:10) {
        # Hold out fold i for testing; train on the remaining folds
        fold_test <- training_data[folds[[i]], ]
        fold_train <- training_data[-folds[[i]], ]
        trained_model <- glm(O ~ T, data = fold_train, family = binomial)
        # Predicted probabilities for the held-out fold
        pred_prob <- predict(trained_model, fold_test, type = "response")
        # Classify as "Yes" unless the predicted probability is below 0.5
        glm.pred <- rep("Yes", nrow(fold_test))
        glm.pred[pred_prob < 0.5] <- "No"
        # Misclassification rate for this fold
        error_rate_fold <- mean(glm.pred != fold_test[, 1])
        error_rate_value <- append(error_rate_value, error_rate_fold)
}

mean(error_rate_value)

Upvotes: 1

Views: 232

Answers (1)

annatn998

Reputation: 77

I figured it out. My problem was that I was comparing the predictions against the wrong column of fold_test. Changing the index from the first column to the last column of my data frame, which holds the response O, led to the correct answer!
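For anyone hitting the same thing, here is a minimal sketch of the corrected loop. The data and the folds are simulated stand-ins, since my real training_data and folds are not shown here: I am assuming a numeric predictor T in the first column and a "Yes"/"No" response O in the last column. Referring to the response by name (fold_test$O) rather than by position avoids the mistake entirely. It also shows why the error rate came out as exactly 1: comparing the "Yes"/"No" labels against a numeric column never produces a match.

```r
set.seed(1)

# Hypothetical stand-in data (assumption): predictor T first, response O last
n <- 200
x <- rnorm(n)
y <- ifelse(runif(n) < plogis(2 * x), "Yes", "No")
training_data <- data.frame(T = x, O = factor(y, levels = c("No", "Yes")))

# Hypothetical folds (assumption): a list of 10 disjoint row-index vectors
folds <- split(sample(n), rep(1:10, length.out = n))

error_rate_value <- numeric(length(folds))
for (i in seq_along(folds)) {
  fold_test  <- training_data[folds[[i]], ]
  fold_train <- training_data[-folds[[i]], ]

  # Fit on the training folds, predict P(O == "Yes") on the held-out fold
  trained_model <- glm(O ~ T, data = fold_train, family = binomial)
  pred_prob <- predict(trained_model, newdata = fold_test, type = "response")
  glm_pred <- ifelse(pred_prob < 0.5, "No", "Yes")

  # Compare against the response column by name, not by position
  error_rate_value[i] <- mean(glm_pred != fold_test$O)
}

mean_error <- mean(error_rate_value)

# The original mistake, on the last fold: column 1 is the numeric predictor,
# so no label ever matches and the "error rate" is 1
wrong_error <- mean(glm_pred != fold_test[, 1])
```

Here mean_error is the average misclassification rate over the 10 folds, while wrong_error reproduces the value of 1 from the question.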

Upvotes: 1
