I am trying to train a model using caret with this tune grid:
svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50,100))
and then once again with a subset of this grid:
svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50))
The problem is that I get a different "best tune" and different "resampling results across tuning parameters", even though the C value chosen with the first tune grid also appears in the second tune grid.
I also encounter these discrepancies when using different options for the sampling parameter in trainControl(), and when using different summaryFunction options.
Needless to say, since a different best model is selected every time, this affects the prediction results on the test set.
Does anyone have a clue why this is happening?
Reproducible data set:
library(caret)
library(doMC)
registerDoMC(cores = 8)
set.seed(2969)
imbal_train <- twoClassSim(100, intercept = -20, linearVars = 20)
imbal_test <- twoClassSim(100, intercept = -20, linearVars = 20)
table(imbal_train$Class)
Run using the first tune grid
svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50,100))
up_fitControl = trainControl(method = "cv", number = 10, savePredictions = TRUE, allowParallel = TRUE, sampling = "up", seeds = NA)
set.seed(5627)
up_inside <- train(Class ~ ., data = imbal_train,
method = "svmLinear",
trControl = up_fitControl,
tuneGrid = svmGrid,
scale = FALSE)
up_inside
First run output:
> up_inside
Support Vector Machines with Linear Kernel
100 samples
25 predictors
2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
C Accuracy Kappa Accuracy SD Kappa SD
1e-04 0.7734343 0.252201364 0.1227632 0.3198165
1e-03 0.8225253 0.396439198 0.1245455 0.3626456
1e-02 0.7595960 0.116150973 0.1431780 0.3046825
1e-01 0.7686869 0.051430454 0.1167093 0.2712062
1e+00 0.7695960 -0.004261294 0.1162279 0.2190151
1e+01 0.7093939 0.111852756 0.2030250 0.3810059
2e+01 0.7195960 0.040458804 0.1932690 0.2580560
3e+01 0.7195960 0.040458804 0.1932690 0.2580560
4e+01 0.7195960 0.040458804 0.1932690 0.2580560
5e+01 0.7195960 0.040458804 0.1932690 0.2580560
1e+02 0.7195960 0.040458804 0.1932690 0.2580560
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was C = 0.001.
Run using the second tune grid
svmGrid <- expand.grid(C = c(0.0001,0.001,0.01,0.1,1,10,20,30,40,50))
up_fitControl = trainControl(method = "cv", number = 10, savePredictions = TRUE, allowParallel = TRUE, sampling = "up", seeds = NA)
set.seed(5627)
up_inside <- train(Class ~ ., data = imbal_train,
method = "svmLinear",
trControl = up_fitControl,
tuneGrid = svmGrid,
scale = FALSE)
up_inside
Second run output:
> up_inside
Support Vector Machines with Linear Kernel
100 samples
25 predictors
2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
C Accuracy Kappa Accuracy SD Kappa SD
1e-04 0.8125253 0.392165694 0.13043060 0.3694786
1e-03 0.8114141 0.375569633 0.12291273 0.3549978
1e-02 0.7995960 0.205413345 0.06734882 0.2662161
1e-01 0.7495960 0.017139266 0.09742161 0.2270128
1e+00 0.7695960 -0.004261294 0.11622791 0.2190151
1e+01 0.7093939 0.111852756 0.20302503 0.3810059
2e+01 0.7195960 0.040458804 0.19326904 0.2580560
3e+01 0.7195960 0.040458804 0.19326904 0.2580560
4e+01 0.7195960 0.040458804 0.19326904 0.2580560
5e+01 0.7195960 0.040458804 0.19326904 0.2580560
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was C = 1e-04.
If you don't provide seeds in caret, it will choose them for you. Since your two grids have different lengths, the seeds caret generates differ slightly between the two runs: for each resample, caret draws one seed per candidate model (plus one final seed), so an 11-value grid consumes more seeds per fold than a 10-value grid does.
Below I've pasted the seeds from both fits; I renamed your second model up_inside2 so the outputs are easier to compare:
> up_inside$control$seeds[[1]]
[1] 825016 802597 128276 935565 324036 188187 284067 58853 923008 995461 60759
> up_inside2$control$seeds[[1]]
[1] 825016 802597 128276 935565 324036 188187 284067 58853 923008 995461
> up_inside$control$seeds[[2]]
[1] 966837 256990 592077 291736 615683 390075 967327 349693 73789 155739 916233
# See how the first seed here is the same as the last seed of the first model
> up_inside2$control$seeds[[2]]
[1] 60759 966837 256990 592077 291736 615683 390075 967327 349693 73789
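You can see the structural difference directly from the lengths: caret expects seeds to be a list of B + 1 elements for B resamples, where each of the first B vectors is as long as the tuning grid. A minimal check, assuming the two fitted objects above:
length(up_inside$control$seeds)     # 11 elements: one vector per fold, plus one final seed
lengths(up_inside$control$seeds)    # first 10 entries hold 11 seeds each (11 C values)
lengths(up_inside2$control$seeds)   # here they hold 10 seeds each (10 C values)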
If you now go ahead and set your own seeds, you'll get the same output:
# Seeds for your first train
myseeds <- list(c(1:10,1000), c(11:20,2000), c(21:30, 3000),c(31:40, 4000),c(41:50, 5000),
c(51:60, 6000),c(61:70, 7000),c(71:80, 8000),c(81:90, 9000),c(91:100, 10000), c(343))
# Seeds for your second train
myseeds2 <- list(c(1:10), c(11:20), c(21:30),c(31:40),c(41:50),c(51:60),
c(61:70),c(71:80),c(81:90),c(91:100), c(343))
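For completeness, the seeds plug into trainControl() via its seeds argument. A minimal sketch of the wiring, reusing the question's setup (the two grids are named svmGrid11 and svmGrid10 here just for clarity):
svmGrid11 <- expand.grid(C = c(0.0001, 0.001, 0.01, 0.1, 1, 10, 20, 30, 40, 50, 100))
svmGrid10 <- expand.grid(C = c(0.0001, 0.001, 0.01, 0.1, 1, 10, 20, 30, 40, 50))

# First train: 11 seeds per fold to match the 11-value grid
up_fitControl <- trainControl(method = "cv", number = 10, savePredictions = TRUE,
                              allowParallel = TRUE, sampling = "up", seeds = myseeds)
set.seed(5627)
up_inside <- train(Class ~ ., data = imbal_train, method = "svmLinear",
                   trControl = up_fitControl, tuneGrid = svmGrid11, scale = FALSE)

# Second train: 10 seeds per fold, identical to the first 10 seeds above,
# so the shared C values get the same seeds in both runs
up_fitControl2 <- trainControl(method = "cv", number = 10, savePredictions = TRUE,
                               allowParallel = TRUE, sampling = "up", seeds = myseeds2)
set.seed(5627)
up_inside2 <- train(Class ~ ., data = imbal_train, method = "svmLinear",
                    trControl = up_fitControl2, tuneGrid = svmGrid10, scale = FALSE)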
> up_inside
Support Vector Machines with Linear Kernel
100 samples
25 predictor
2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
C Accuracy Kappa
1e-04 0.7714141 0.239823027
1e-03 0.7914141 0.332834590
1e-02 0.7695960 0.207000745
1e-01 0.7786869 0.103957926
1e+00 0.7795960 0.006849817
1e+01 0.7093939 0.111852756
2e+01 0.7195960 0.040458804
3e+01 0.7195960 0.040458804
4e+01 0.7195960 0.040458804
5e+01 0.7195960 0.040458804
1e+02 0.7195960 0.040458804
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was C = 0.001.
> up_inside2
Support Vector Machines with Linear Kernel
100 samples
25 predictor
2 classes: 'Class1', 'Class2'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 90, 91, 90, 90, 89, 90, ...
Addtional sampling using up-sampling
Resampling results across tuning parameters:
C Accuracy Kappa
1e-04 0.7714141 0.239823027
1e-03 0.7914141 0.332834590
1e-02 0.7695960 0.207000745
1e-01 0.7786869 0.103957926
1e+00 0.7795960 0.006849817
1e+01 0.7093939 0.111852756
2e+01 0.7195960 0.040458804
3e+01 0.7195960 0.040458804
4e+01 0.7195960 0.040458804
5e+01 0.7195960 0.040458804
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was C = 0.001.
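As a quick sanity check (again a sketch, assuming the two objects above), the resampling results for the shared C values now agree exactly:
# Join the results tables on C and compare accuracies
shared <- merge(up_inside$results, up_inside2$results, by = "C", suffixes = c(".g11", ".g10"))
all.equal(shared$Accuracy.g11, shared$Accuracy.g10)  # TRUE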