Reputation: 299
I am trying to understand how 5-fold cross-validation works in the caret package, but I could not find out how to get the training set and test set for each fold, and the similar suggested questions did not cover this either. Suppose I want to cross-validate a random forest; I do the following:
library(caret)
set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
str(first_holdout)
'data.frame': 90 obs. of 5 variables:
$ pred : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
$ obs : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
$ rowIndex: int 2 3 9 11 25 29 35 36 41 50 ...
$ mtry : num 2 2 2 2 2 2 2 2 2 2 ...
$ Resample: chr "Fold1" "Fold1" "Fold1" "Fold1" ...
Are these 90 observations in Fold1 used as the training set? If so, where is the test set for this fold?
Upvotes: 3
Views: 2091
Reputation: 3843
str(rfmodel)
The fitted model object stores everything in the elements listed below. Its control element holds the row indexes of the samples used for training in each fold (in index) and of the corresponding hold-out samples (in indexOut).
names(rfmodel)
# [1] "method" "modelInfo" "modelType" "results" "pred"
# [6] "bestTune" "call" "dots" "metric" "control"
# [11] "finalModel" "preProcess" "trainingData" "resample" "resampledCM"
# [16] "perfNames" "maximize" "yLimits" "times" "levels"
# [21] "terms" "coefnames" "xlevels"
Path to indexes of Train and Hold Out samples
# Indexes of Hold Out Sets
rfmodel$control$indexOut
# Indexes of Train Sets for above hold outs
rfmodel$control$index
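To turn those indexes into actual data frames, something like the sketch below should work. It assumes the same iris model as in the question and that the randomForest package is installed; the names train_rows, test_rows, fold1_train, fold1_test and fold1_pred are only illustrative. It also suggests where the 90 rows come from: rfmodel$pred keeps the hold-out predictions for every mtry value tried, so with 30 held-out rows per fold and (by default, as far as I know) three mtry values, Fold1 ends up with 90 rows. In other words, those rows are test-set predictions, not the training set.

```r
library(caret)

set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")

# Row numbers (into iris) used for fitting and for hold-out in the first fold.
# Positional indexing is used because the two lists may be named differently.
train_rows <- rfmodel$control$index[[1]]
test_rows  <- rfmodel$control$indexOut[[1]]

fold1_train <- iris[train_rows, ]
fold1_test  <- iris[test_rows, ]

# The two sets should not overlap and should together cover all rows
length(intersect(train_rows, test_rows))
nrow(fold1_train) + nrow(fold1_test)

# Hold-out predictions for fold 1, restricted to the selected mtry
fold1_pred <- subset(rfmodel$pred,
                     Resample == "Fold1" & mtry == rfmodel$bestTune$mtry)
nrow(fold1_pred)
```

The rowIndex column of fold1_pred should match test_rows, which is a quick way to check that the predictions in rfmodel$pred belong to the hold-out rows of that fold.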
Upvotes: 0