doe doe

Reputation: 33

trainControl in caret package

In the caret package, the trainControl function lets us perform a variety of cross-validation schemes. To perform 10-fold cross-validation (here repeated 10 times), one would use

fitControl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
fitJ48_10_fold <- train(x = x, y = y, method = "J48", trControl = fitControl)

while to fit on the training set without any resampling, it is

fitControl <- trainControl(method = "none")
fitJ48train <- train(x = x, y = y, method = "J48", trControl = fitControl)

However, the confusion matrices of these two models are identical for the 10-fold and training-set fits.

Activity <- predict(fitJ48_10_fold, newdata = Train)
confusionMatrix(Activity, Train$Activity)

Activity <- predict(fitJ48train, newdata = Train)
confusionMatrix(Activity, Train$Activity)

I used the Weka classifier GUI, and there the performance of J48 under 10-fold cross-validation is indeed lower than on the training set. Am I wrong to suspect that trainControl from caret isn't working, or am I passing it in the wrong way?

Upvotes: 2


Answers (1)

topepo

Reputation: 14331

Am I wrong to suspect that trainControl from caret isn't working, or am I passing it in the wrong way?

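A little. For J48, there is a tuning parameter, but the default grid only fits a single value, C = 0.25. The resampling method in trainControl is only used to estimate performance and to choose tuning parameters; the final model is always refit on the entire training set. So the final model will be the same no matter what value of method you use in trainControl, and the confusion matrices obtained by predicting on the training data will always be the same.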
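To see the difference the question is looking for, one can compare the resubstitution confusion matrix with the resampled estimates stored in the train object. The following is only a rough sketch: iris is a stand-in for the question's data (which is not shown), the object names are made up for illustration, and it assumes caret and RWeka (with a Java runtime) are installed. Depending on the caret version, the default J48 grid may contain more than the single C = 0.25 value mentioned above.

```r
library(caret)  # method = "J48" additionally requires the RWeka package and Java

set.seed(1)

# Assumption: iris stands in for the question's x and y
x <- iris[, 1:4]
y <- iris$Species

cv_ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 10)
fit_cv  <- train(x = x, y = y, method = "J48", trControl = cv_ctrl)

# Resubstitution: the final model predicts the same rows it was trained on,
# so this matrix does not depend on the resampling method
resub <- confusionMatrix(predict(fit_cv, newdata = x), y)
print(resub$table)

# Resampled performance: accuracy and kappa averaged over the held-out folds
print(fit_cv$results)

# Cross-validated confusion matrix (cell percentages averaged over resamples)
print(confusionMatrix(fit_cv))
```

The last two calls report held-out performance, which is the quantity comparable to Weka's 10-fold cross-validation output.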

Max

Upvotes: 1
