Reputation: 77
I'm new to R and struggling to understand some concepts. Could someone help me understand how to calculate the average error rate over 10-fold CV? My code is below. The mean error rate comes out as 1, even though I know the answer should be 0.149. I suspect the problem is in my predicted probabilities, which I may have calculated incorrectly. If anyone notices something off in this code, I'd appreciate it if you could point it out!
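error_rate_value <- numeric(0)  # per-fold error rates
for (i in 1:10) {
  # Hold out fold i for testing; train on the remaining folds
  fold_test <- training_data[folds[[i]], ]
  fold_train <- training_data[-folds[[i]], ]
  trained_model <- glm(O ~ T, data = fold_train, family = binomial)
  # Predicted probabilities for the held-out fold
  pred_prob <- predict(trained_model, fold_test, type = 'response')
  # Classify as "Yes" unless the predicted probability is below 0.5
  glm.pred <- rep("Yes", dim(fold_test)[1])
  glm.pred[pred_prob < .5] <- "No"
  # Misclassification rate: predictions compared with column 1 of fold_test
  error_rate_fold <- mean(glm.pred != fold_test[, 1])
  error_rate_value <- append(error_rate_value, error_rate_fold)
}
mean(error_rate_value)  # average error rate over the 10 folds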
Upvotes: 1
Views: 232
Reputation: 77
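I figured it out. My problem was that I was comparing the predictions against the wrong column of fold_test. The predicted probabilities were fine; fold_test[,1] was simply not the response column. Changing it to the last column in my data frame gave the correct answer!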
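For anyone hitting the same symptom, here is a minimal sketch of the corrected loop. It runs on simulated data, not my real data: the data frame contents, the seed, and the way `folds` is built are all assumptions made up for illustration. It refers to the response by name (`fold_test$O`) rather than by position, so the column order no longer matters, and it also shows one way an error rate of exactly 1 can arise when the comparison column is not the response.

```r
# Simulated data for illustration only (an assumption, not the original data).
set.seed(1)
n <- 200
training_data <- data.frame(T = rnorm(n))
# Response O is a factor with levels No/Yes and is the LAST column.
training_data$O <- factor(
  ifelse(runif(n) < plogis(2 * training_data$T), "Yes", "No"),
  levels = c("No", "Yes")
)

# Hypothetical fold assignment: a list of 10 disjoint index vectors.
folds <- split(sample(n), rep(1:10, length.out = n))

error_rate_value <- numeric(length(folds))
wrong_rate_value <- numeric(length(folds))
for (i in seq_along(folds)) {
  fold_test  <- training_data[folds[[i]], ]
  fold_train <- training_data[-folds[[i]], ]
  trained_model <- glm(O ~ T, data = fold_train, family = binomial)
  # With a factor response, glm models the probability of the second level ("Yes").
  pred_prob <- predict(trained_model, newdata = fold_test, type = "response")
  glm.pred <- ifelse(pred_prob < 0.5, "No", "Yes")
  # Correct: compare against the response column, referenced by name.
  error_rate_value[i] <- mean(glm.pred != as.character(fold_test$O))
  # Wrong: column 1 here is the numeric predictor, so no label ever matches.
  wrong_rate_value[i] <- mean(glm.pred != fold_test[, 1])
}
mean(error_rate_value)  # average 10-fold CV error rate
mean(wrong_rate_value)  # 1, because "Yes"/"No" never equal a number
```

Indexing by name is a design choice rather than a requirement; `fold_test[, ncol(fold_test)]` would work equally well here as long as the response stays in the last column.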
Upvotes: 1