Reputation: 431
I have a question regarding the rfe function from the caret library. On the caret homepage they describe the RFE algorithm with external resampling (algorithm listing omitted here).
For this example I am using the rfe function with 3-fold cross-validation and the train function with a linear SVM and 5-fold cross-validation.
library(kernlab)
library(caret)
data(iris)

# parameters for train(), used for fitting the SVM
trControl <- trainControl(method = "cv", number = 5)

# parameters for the RFE wrapper: 3-fold external CV
rfeControl <- rfeControl(functions = caretFuncs, method = "cv",
                         number = 3, verbose = FALSE)

rf1 <- rfe(as.matrix(iris[, 1:4]), as.factor(iris[, 5]), sizes = c(2, 3),
           rfeControl = rfeControl, trControl = trControl,
           method = "svmLinear")
My understanding is that rfe would split the data (150 samples) into 3 folds, and that the train function would then be run on the training set of each fold (~100 samples) with 5-fold cross-validation to tune the model parameters, followed by the RFE steps. What confuses me is what I see when I look at the results of the rfe function:
> lapply(rf1$control$index, length)
$Fold1
[1] 100
$Fold2
[1] 101
$Fold3
[1] 99
> lapply(rf1$fit$control$index, length)
$Fold1
[1] 120
$Fold2
[1] 120
$Fold3
[1] 120
$Fold4
[1] 120
$Fold5
[1] 120
From that it appears that the training sets of the 5-fold CV contain 120 samples each, whereas I would expect a size of 80.
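To spell out the arithmetic behind my expectation:

# train() inside an outer fold would see ~100 rows, so each inner
# 5-fold training split should hold about
100 * 4/5   # 80
# the reported 120 instead corresponds to the full data:
150 * 4/5   # 120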
So it would be great if someone could clarify how rfe and train work together.
Cheers
> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)
locale:
[1] C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] pROC_1.5.4 e1071_1.6-1 class_7.3-5 caret_5.15-048
[5] foreach_1.4.0 cluster_1.14.3 plyr_1.7.1 reshape2_1.2.1
[9] lattice_0.20-10 kernlab_0.9-15
loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_2.15.1 grid_2.15.1 iterators_1.0.6
[5] stringr_0.6.1 tools_2.15.1
Upvotes: 9
Views: 3838
Reputation: 11
The problem here is that rf1$fit$control$index does not store what we think it does. To understand this, it was necessary to look into the code. What happens there is the following:
When you call rfe, the whole data set is passed to nominalRfeWorkflow. In nominalRfeWorkflow, the data is split into train and test sets according to rfeControl (in our example 3 times, following the 3-fold CV rule), and each split is passed to rfeIter. These splits can be found in the result under rf1$control$index.
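You can verify that these are the external 3-fold training splits and that they jointly cover all 150 rows:

sapply(rf1$control$index, length)          # ~100 training rows per external fold
length(unique(unlist(rf1$control$index)))  # 150: every row occurs in some split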
In rfeIter, the ~100 training samples (in our example) are used to find the final variables (which is the output of that function). As I understand it, the ~50 test samples (in our example) are used to calculate the performance of the different variable sets, but they are only stored as external performance and are not used to select the final variables. For that selection, the performance estimates of the 5-fold cross-validation are used. But we cannot find those inner indices in the final result returned by rfe.
If we really need them, we would have to fetch them from fitObject$control$index in rfeIter, return them to nominalRfeWorkflow, then to rfe, and from there include them in the resulting rfe-class object returned by rfe.
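The external performance itself, by contrast, is easy to get at; the field names below are from current caret versions, so check str(rf1) if yours differ:

rf1$results    # performance per subset size, averaged over the 3 external folds
rf1$resample   # fold-by-fold external performance (if returnResamp kept it)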
So what is stored in rf1$fit$control$index? When rfe has found the best variables, a final model is fit using those best variables and the full reference data (150 samples). rf1$fit is created in rfe as follows:
fit <- rfeControl$functions$fit(x[, bestVar, drop = FALSE],
                                y,
                                first = FALSE,
                                last = TRUE,
                                ...)
With functions = caretFuncs, this fit function again runs train and performs a final cross-validation on the full reference data with the final feature set, using the trControl passed in via the ellipsis (...).
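You can see this directly, since caretFuncs$fit is essentially a thin wrapper around train; printing it in your installed version should show something like:

caretFuncs$fit
# function (x, y, first, last, ...)
# train(x, y, ...)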
Since our trControl specifies 5-fold CV, it is therefore correct that lapply(rf1$fit$control$index, length) returns 120, because 150/5*4 = 120.
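A quick sanity check along those lines (using the same index fields as above):

sapply(rf1$fit$control$index, length)          # 120 120 120 120 120
length(unique(unlist(rf1$fit$control$index)))  # 150: the final fit used all rows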
Upvotes: 1