Reputation: 843
When doing cross-validation for model selection, I have found several ways to quote the "standard deviation" of the cross-validation scores (here "score" means an evaluation metric, e.g. accuracy, AUC, loss, etc.):
1) One way is to calculate the standard error of the mean of the K fold scores (= standard deviation of the K fold scores / sqrt(K)).
2) The second way is to calculate just the standard deviation of the scores of K folds. An example can be found here:
http://scikit-learn.org/stable/auto_examples/svm/plot_svm_anova.html
3) A third way, which I don't fully understand, seems to calculate the standard deviation of the K fold scores / sqrt(N), where N is the size of the dataset:
http://scikit-learn.org/stable/auto_examples/exercises/plot_cv_diabetes.html
Personally I think 1) is correct, as we care more about the standard error of the sample mean (here, the average score over the K folds) than about the standard deviation of the sample. Can anyone explain which way is preferred?
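To make the three quantities concrete, here is a minimal sketch. The fold scores and the dataset size are made up for illustration and do not come from the linked examples.

```python
import math
import statistics

# Hypothetical per-fold accuracies from a 5-fold run (made-up numbers).
fold_scores = [0.81, 0.79, 0.84, 0.80, 0.76]
n_samples = 500  # hypothetical dataset size N

k = len(fold_scores)
mean_score = statistics.mean(fold_scores)

# 2) Spread of the individual fold scores.
#    statistics.stdev uses the sample (n - 1) denominator.
fold_std = statistics.stdev(fold_scores)

# 1) Standard error of the mean over the K folds.
sem_over_folds = fold_std / math.sqrt(k)

# 3) Fold standard deviation divided by sqrt(N).
std_over_sqrt_n = fold_std / math.sqrt(n_samples)

print(f"mean score          : {mean_score:.4f}")
print(f"2) std of folds     : {fold_std:.4f}")
print(f"1) std / sqrt(K)    : {sem_over_folds:.4f}")
print(f"3) std / sqrt(N)    : {std_over_sqrt_n:.4f}")
```

Since N is much larger than K, 3) always comes out smaller than 1), which in turn is smaller than 2), so it matters which one an error bar actually shows.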
Upvotes: 6
Views: 7646
Reputation: 11
There is not much contradiction in these cases.
The standard deviation measures the variation of the individual scores, i.e. how much a single score computed on one of the k folds varies. The standard error measures the variation of the mean of the scores over the k folds.
When looking for the "true" value of the score, use the standard error in this way:
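The true value of the score is expected to lie within the mean score plus or minus a small multiple of the standard error, with wider ranges giving higher confidence.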
(These ranges are called confidence intervals.)
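As a sketch, such ranges are commonly written as below. This assumes the k fold scores are roughly independent and approximately normally distributed, which is only an approximation because the folds share training data. The symbols $\bar{s}$ (mean fold score), $s$ (standard deviation of the fold scores) and $\mathrm{SE}$ are notation introduced here for illustration.

```latex
\mathrm{SE} = \frac{s}{\sqrt{k}}, \qquad
\bar{s} \pm 1\,\mathrm{SE} \quad (\approx 68\%), \qquad
\bar{s} \pm 1.96\,\mathrm{SE} \quad (\approx 95\%)
```

With a small number of folds, a multiplier from the t-distribution with k - 1 degrees of freedom is usually a safer choice than 1.96.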
Upvotes: 1