mql4beginner

Reputation: 2243

How to present accuracy of different models using caret package in the same list

I'm trying to compare the performance of several models using the caret package. I get the results for each model separately, but I would like a single list that contains the accuracy and ROC of all the models together. How can I do it? Here is my toy data and two models:

dat <- read.table(text = " target birds    wolfs     snakes
        0        3        9         7
        1        3        8         4
        1        1        2         8
        0        1        2         3
        0        1        8         3
        1        6        1         2
        0        6        7         1
        1        6        1         5
        0        5        9         7
        1        3        8         7
        1        4        2         7
        0        1        2         3
        0        7        6         3
        1        6        1         1
        0        6        3         9
        1        6        1         1   ",header = TRUE)

Here are the two models:

svmRadial <- train(target ~ ., data = dat, method='svmRadial')
glm <- train(target ~ ., data = dat, method='glm')

I would like to get a table like this as output:

ModelName  Accuracy  ROC
svmRadial   0.95     0.74
glm         0.93     0.7

Upvotes: 0


Answers (1)

cdeterman

Reputation: 19970

This is essentially a question about customizing the summaryFunction passed to trainControl. You can see a similar question here. The function below combines caret's defaultSummary and twoClassSummary functions so that each resample reports both accuracy and the area under the ROC curve.

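library(caret)

# Summary function returning both accuracy and ROC AUC for each resample.
# caret passes `data` with columns obs, pred and one probability column per class level.
mySummary <- function(data, lev = NULL, model = NULL)
{
    if (!requireNamespace("pROC", quietly = TRUE))
        stop("package 'pROC' is required")
    if (!all(levels(data[, "pred"]) == levels(data[, "obs"])))
        stop("levels of observed and predicted data do not match")
    # AUC computed from the predicted probability of the first class level
    rocObject <- try(pROC::roc(data$obs, data[, lev[1]]),
                     silent = TRUE)
    rocAUC <- if (inherits(rocObject, "try-error")) NA else as.numeric(rocObject$auc)

    if (!is.factor(data$obs))
        data$obs <- factor(data$obs, levels = lev)
    Acc <- postResample(data[, "pred"], data[, "obs"])[1]

    out <- c(Acc, rocAUC)
    names(out) <- c("Accuracy","ROC")
    out
}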


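fitControl <- trainControl(classProbs = TRUE,
                           summaryFunction = mySummary)

# classProbs = TRUE needs class levels that are valid R variable names,
# so relabel 0/1 (the labels "no"/"yes" are an arbitrary choice)
dat$target <- factor(dat$target, levels = c(0, 1), labels = c("no", "yes"))

set.seed(123)
svmRadial_acc_roc <- train(target ~ ., data = dat, method='svmRadial', trControl=fitControl)
glm_acc_roc <- train(target ~ ., data = dat, method='glm', trControl=fitControl)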
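To make it clearer what the summary function is computing, here is a minimal base-R sketch of the two metrics on a made-up hold-out set. The data frame `toy` is hypothetical and only mimics the shape of the `data` argument (observed class, predicted class, one probability column per class); the AUC is computed with the rank (Mann-Whitney) formulation rather than with pROC.

```r
# Made-up stand-in for the `data` argument a summaryFunction receives
toy <- data.frame(
  obs  = factor(c("no", "no", "yes", "yes", "no", "yes"), levels = c("no", "yes")),
  pred = factor(c("no", "yes", "yes", "yes", "no", "no"), levels = c("no", "yes")),
  no   = c(0.9, 0.4, 0.2, 0.3, 0.7, 0.6),
  yes  = c(0.1, 0.6, 0.8, 0.7, 0.3, 0.4)
)

# Accuracy: share of held-out rows where the prediction matches the observation
accuracy <- mean(toy$pred == toy$obs)

# AUC via ranks: the probability that a randomly chosen "yes" row receives a
# higher P(yes) than a randomly chosen "no" row
r   <- rank(toy$yes)
n1  <- sum(toy$obs == "yes")
n0  <- sum(toy$obs == "no")
auc <- (sum(r[toy$obs == "yes"]) - n1 * (n1 + 1) / 2) / (n1 * n0)

c(Accuracy = accuracy, ROC = auc)
```

Here 4 of the 6 predictions are correct, and 8 of the 9 yes/no pairs are ordered correctly by P(yes).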

I believe it is considered better practice to look at the distribution of each metric across the resamples rather than a single number per model. To do so, you would use the resamples function.

results <- resamples(list(svm=svmRadial_acc_roc, glm=glm_acc_roc))
summary(results)

Call:
summary.resamples(object = results)

Models: svm, glm 
Number of resamples: 25 

Accuracy 
      Min. 1st Qu. Median   Mean 3rd Qu.   Max. NA's
svm 0.2500  0.5000  0.625 0.6034  0.6667 1.0000    0
glm 0.1667  0.4286  0.500 0.4993  0.6000 0.7143    0

ROC 
      Min. 1st Qu. Median   Mean 3rd Qu. Max. NA's
svm 0.4444  0.5608 0.6667 0.7422     1.0    1    1
glm 0.4444  0.6250 0.6667 0.7108     0.8    1    0
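A comparison with resamples is easiest to interpret when every model is evaluated on the same resamples. One way to arrange that, sketched here under assumptions, is to build the bootstrap indices once and hand them to trainControl through its `index` argument. The names `n_rows`, `n_resamples`, and `boot_index` are hypothetical; 16 is the number of rows in the toy data and 25 matches the number of resamples reported above.

```r
set.seed(123)
n_rows      <- 16  # rows in the toy data set
n_resamples <- 25  # number of bootstrap resamples

# One integer vector of training-row indices per resample, drawn with replacement
boot_index <- lapply(seq_len(n_resamples), function(i) {
  sort(sample(n_rows, size = n_rows, replace = TRUE))
})
names(boot_index) <- sprintf("Resample%02d", seq_len(n_resamples))

str(boot_index[1:2])
```

The list could then be passed as `trainControl(index = boot_index, classProbs = TRUE, summaryFunction = mySummary)` and that control object reused for both train calls, so each model sees identical training and hold-out rows.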

That said, if you really want that simple table, you can pull the resampled metrics out of each fit's results:

# svmRadial was tuned over C, so keep only the row for the selected ('best tune') value
svm_result <- svmRadial_acc_roc$results[
    svmRadial_acc_roc$results$C == svmRadial_acc_roc$bestTune$C,
    c("Accuracy", "ROC")]
glm_result <- glm_acc_roc$results[,c("Accuracy", "ROC")]

# make data.frame
data.frame(ModelName = c("svmRadial", "glm"),
           Accuracy = c(svm_result$Accuracy, glm_result$Accuracy),
           ROC = c(svm_result$ROC, glm_result$ROC)
)

  ModelName  Accuracy       ROC
1 svmRadial 0.6034444 0.7421875
2       glm 0.4993333 0.7107778
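With more than two models, repeating this by hand gets tedious. A small helper along the following lines may do the same job for a named list of fits. `model_table` is a hypothetical name, and the sketch assumes each fit carries a `results` data frame and a `bestTune` data frame that share the tuning-parameter columns, as train objects do.

```r
# Hypothetical helper: one row of resampled metrics per fitted model.
# Assumes every element of `fits` has $results and $bestTune.
model_table <- function(fits, metrics = c("Accuracy", "ROC")) {
  rows <- lapply(names(fits), function(nm) {
    fit <- fits[[nm]]
    # Keep only the row of the tuning grid that matches the selected parameters
    best <- merge(fit$results, fit$bestTune)
    data.frame(ModelName = nm, best[1, metrics, drop = FALSE], row.names = NULL)
  })
  do.call(rbind, rows)
}
```

It would be called as `model_table(list(svmRadial = svmRadial_acc_roc, glm = glm_acc_roc))`. Merging on all tuning parameters, rather than filtering on C alone, also covers models that tune more than one parameter.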

Upvotes: 4
