xiaoxiao87

Reputation: 843

Which standard deviation of the cross-validation score?

When doing cross-validation for model selection, I found there are several ways to quote the "standard deviation" of the cross-validation scores (here "score" means an evaluation metric, e.g. accuracy, AUC, loss, etc.).

1) One way is to calculate the standard deviation of the mean of the scores of the K folds (= standard deviation of the K fold scores / sqrt(K)).

2) The second way is to calculate just the standard deviation of the scores of K folds. An example can be found here:

http://scikit-learn.org/stable/auto_examples/svm/plot_svm_anova.html

3) Another way I don't fully understand. It seems to calculate the standard deviation of K folds / sqrt(N) where N is the size of the dataset...

http://scikit-learn.org/stable/auto_examples/exercises/plot_cv_diabetes.html

Personally I think 1) is correct, as we care more about the standard error on the sample mean (here = the average score of K folds validation) rather than the standard deviation of the sample. Can anyone explain which way is preferred?

Upvotes: 6

Views: 7646

Answers (1)

EKal-aa

Reputation: 11

There is not much contradiction in these cases.

  1. The standard deviation of the K fold scores divided by sqrt(K) is the standard error of the mean score.
  2. In the first linked example, they report the standard deviation of the fold scores, not the standard error.
  3. In the second linked example, they compute the standard error as in 1), but they use the variable name "n_folds" instead of "k". N (n_folds) is the number of folds, not the size of the dataset.

The standard deviation measures how much a single score (the score for one of the k folds) varies. The standard error measures how much the mean of the scores over the k folds varies.
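To make the difference concrete, here is a minimal sketch that computes both quantities from a list of per-fold scores. The function name `fold_score_spread` and the score values are made up for illustration. It uses the sample standard deviation (dividing by K - 1); note that `numpy.std` divides by K by default, so results from code that calls it without `ddof=1` will be slightly smaller.

```python
import math
import statistics


def fold_score_spread(scores):
    """Return (mean, standard deviation, standard error) of per-fold scores."""
    k = len(scores)
    mean = statistics.mean(scores)
    # Sample standard deviation of the individual fold scores (divides by k - 1).
    std = statistics.stdev(scores)
    # Standard error of the mean: spread of the average over the k folds.
    sem = std / math.sqrt(k)
    return mean, std, sem


# Hypothetical accuracy values from a 5-fold cross-validation run.
scores = [0.80, 0.82, 0.78, 0.84, 0.76]
mean, std, sem = fold_score_spread(scores)
print(f"mean={mean:.4f}  std={std:.4f}  standard error={sem:.4f}")
```

The standard error is always smaller than the standard deviation by a factor of sqrt(K), which is why the two conventions give visibly different error bars for the same run.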

When looking for the "true" value of the score, use the standard error in this way:

The true value of the score is

  • with about 68% probability in the range of (mean - standard error) to (mean + standard error)
  • with about 95% probability in the range of (mean - 2*standard error) to (mean + 2*standard error)

(These ranges are called confidence intervals.)
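As a rough sketch, the ranges above could be computed like this. The helper `mean_score_interval` and the scores are hypothetical, and the multipliers 1 and 2 rely on a normal approximation. With only a handful of folds, and with fold scores that are not independent because the training sets overlap, the actual coverage is only approximate.

```python
import math
import statistics


def mean_score_interval(scores, z=1.0):
    """Return (low, high) = mean -/+ z * standard error of the fold scores."""
    mean = statistics.mean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - z * sem, mean + z * sem


# Hypothetical accuracy values from a 5-fold cross-validation run.
scores = [0.80, 0.82, 0.78, 0.84, 0.76]
low68, high68 = mean_score_interval(scores, z=1.0)  # roughly 68% range
low95, high95 = mean_score_interval(scores, z=2.0)  # roughly 95% range
print(f"~68%: [{low68:.4f}, {high68:.4f}]")
print(f"~95%: [{low95:.4f}, {high95:.4f}]")
```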

Upvotes: 1
