Leo

Reputation: 93

K-fold cross validation and Out of sample cross validation

What's the difference between K-fold cross validation and out-of-sample cross validation? Could you use a few sentences to describe the steps for each CV method?

Upvotes: 0

Views: 1453

Answers (1)

Guillermo Mosse

Reputation: 462

K-fold cross validation is a type of out-of-sample cross validation. The name "out of sample" comes from the following fact: if we fit the model and compute the MSE on the training set, we will get an optimistically biased assessment of how well the model will fit an independent data set. This biased estimate is called the in-sample estimate of the fit (we would be using training samples), whereas the cross-validation estimate is an out-of-sample estimate.
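The optimism of the in-sample estimate can be seen in a small numerical sketch (my own toy example, not from a specific source): fit ordinary least squares on training data with many spurious features, then compare the training MSE with the MSE on an independent test set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: y depends only on the first feature, but the model
# sees 20 features, so OLS can fit noise (overfit).
n, p = 60, 20
X = rng.normal(size=(n, p))
y = 2 * X[:, 0] + rng.normal(size=n)

# Training set and an independent test set
X_tr, X_te = X[:40], X[40:]
y_tr, y_te = y[:40], y[40:]

# Ordinary least squares fit on the training data only
beta, *_ = np.linalg.lstsq(X_tr, y_tr, rcond=None)

mse_in = np.mean((y_tr - X_tr @ beta) ** 2)   # in-sample (training) MSE
mse_out = np.mean((y_te - X_te @ beta) ** 2)  # out-of-sample MSE
```

Here `mse_in` comes out smaller than `mse_out`: the in-sample estimate is optimistically biased, which is exactly why we cross-validate.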

In k-fold cross-validation, the original sample is randomly partitioned into k equal sized subsamples. Of the k subsamples, a single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The cross-validation process is then repeated k times, with each of the k subsamples used exactly once as the validation data. The k results can then be averaged to produce a single estimation. The advantage of this method over repeated random sub-sampling is that all observations are used for both training and validation, and each observation is used for validation exactly once.
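The steps above can be sketched in plain NumPy (a minimal illustration with an OLS model; the function name and toy data are my own, not from a library):

```python
import numpy as np

def k_fold_cv_mse(X, y, k=5, seed=0):
    """Estimate out-of-sample MSE of an OLS fit via k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))      # randomly partition the sample...
    folds = np.array_split(idx, k)     # ...into k roughly equal subsamples
    mses = []
    for i in range(k):
        val = folds[i]                 # fold i is the validation data
        trn = np.concatenate([folds[j] for j in range(k) if j != i])
        beta, *_ = np.linalg.lstsq(X[trn], y[trn], rcond=None)
        mses.append(np.mean((y[val] - X[val] @ beta) ** 2))
    return float(np.mean(mses))        # average the k results

# Usage on toy data: y is linear in X plus unit-variance noise
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=100)
cv_mse = k_fold_cv_mse(X, y, k=5)
```

Every observation lands in the validation fold exactly once, which is the advantage over repeated random sub-sampling mentioned above.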

For other methods, you can check Wikipedia; they have excellent summaries there: https://en.wikipedia.org/wiki/Cross-validation_(statistics)#Types

Upvotes: 1

Related Questions