Reputation: 33
I am running repeated k-fold cross-validation with the caret package and would like to calculate a confidence interval for my accuracy metric. This tutorial prints a caret train object that shows the accuracy/kappa metrics along with their standard deviations: https://machinelearningmastery.com/tune-machine-learning-algorithms-in-r/. However, when I do this, only the average metric values are listed.
library(caret)
library(randomForest)  # backend for method="rf"

# 10-fold CV repeated 3 times
control <- trainControl(method="repeatedcv", number=10, repeats=3, search="grid")
set.seed(12345)
tunegrid <- expand.grid(.mtry=4)  # hold mtry fixed at 4
rf_gridsearch <- train(as.factor(gear)~., data=mtcars, method="rf",
                       metric="Accuracy",
                       tuneGrid=tunegrid,
                       trControl=control)
print(rf_gridsearch)
Random Forest

32 samples
10 predictors
 3 classes: '3', '4', '5'

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 3 times)
Summary of sample sizes: 29, 28, 30, 29, 27, 28, ...
Resampling results:

  Accuracy   Kappa
  0.8311111  0.7021759

Tuning parameter 'mtry' was held constant at a value of 4
Upvotes: 2
Views: 1389
Reputation: 39
The standard confidence interval formulas are:

Upper interval = X_bar + z * (S / sqrt(n))
Lower interval = X_bar - z * (S / sqrt(n))

where X_bar is the mean accuracy across resamples, S is their standard deviation, n is the number of resamples (here 10 folds x 3 repeats = 30), and z is the critical value (1.96 for a 95% interval).

If you are dealing with a single proportion p:

Upper interval = p + z * sqrt( (p * (1 - p)) / n )
Lower interval = p - z * sqrt( (p * (1 - p)) / n )
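As a minimal sketch of the first formula applied to the train object above (assuming n = 30 resamples, i.e. 10 folds x 3 repeats, and the per-resample SD stored in rf_gridsearch$results):

res <- rf_gridsearch$results      # per-parameter summary from caret
n   <- 10 * 3                     # number of resamples: folds x repeats
z   <- qnorm(0.975)               # ~1.96 for a 95% interval
se  <- res$AccuracySD / sqrt(n)   # standard error of the mean accuracy
res$Accuracy + c(-1, 1) * z * se  # lower and upper bounds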
Upvotes: 0
Reputation: 3335
It looks like it is stored in the results element of the returned train object.
> rf_gridsearch$results
  mtry  Accuracy     Kappa AccuracySD   KappaSD
1    4 0.7572222 0.6046465  0.2088411 0.3387574
A 95% confidence interval can be found using a critical z value of 1.96.
> rf_gridsearch$results$Accuracy+c(-1,1)*1.96*rf_gridsearch$results$AccuracySD
[1] 0.3478936 1.1665509
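Note that the interval above multiplies the raw resample SD, which is why its upper end exceeds 1 (impossible for an accuracy). If you want an interval for the mean accuracy instead, divide by sqrt(n), as in the other answer (a sketch, assuming n = 30 resamples from 10 folds x 3 repeats):

n <- 10 * 3  # folds x repeats (assumption: matches the trainControl above)
rf_gridsearch$results$Accuracy +
  c(-1, 1) * 1.96 * rf_gridsearch$results$AccuracySD / sqrt(n)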
Upvotes: 1