Banjo

Reputation: 1251

Generate a confusion matrix for svm in e1071 for CV results

I trained a classifier with svm() from e1071. The goal is to predict type from all other variables in dtm.

 dtm[140:145] %>% str()
 'data.frame':  385 obs. of  6 variables:
 $ think   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ actually: num  0 0 0 0 0 0 0 0 0 0 ...
 $ comes   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ able    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ hours   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ type    : Factor w/ 4 levels "-1","0","1","9": 4 3 3 3 4 1 4 4 4 3 ...

To train and test the model, I used 10-fold cross-validation.

model <- svm(type~., dtm, cross = 10, gamma = 0.5, cost = 1)
summary(model)

Call:
svm(formula = type ~ ., data = dtm, cross = 10, gamma = 0.5, cost = 1)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  1 
     gamma:  0.5 

Number of Support Vectors:  385

 ( 193 134 41 17 )


Number of Classes:  4 

Levels: 
 -1 0 1 9

10-fold cross-validation on training data:

Total Accuracy: 50.12987 
Single Accuracies:
 52.63158 51.28205 52.63158 43.58974 60.52632 43.58974 57.89474 48.71795 
 39.47368 51.28205 

My question is: how can I generate a confusion matrix for the cross-validation results? Which components of model do I have to pass to table() or confusionMatrix() to get the matrix?

Upvotes: 2

Views: 7334

Answers (2)

missuse

Reputation: 19716

As far as I know, e1071 offers no way to access the per-fold predictions when doing cross-validation.

One easy way around this is to run the folds by hand:

Some data:

library(e1071)
library(mlbench)
data(Sonar)

generate the folds:

k <- 10
folds <- sample(rep(1:k, length.out = nrow(Sonar)), nrow(Sonar))

run the models:

z <- lapply(1:k, function(x){
  model <- svm(Class ~ ., Sonar[folds != x, ], gamma = 0.5, cost = 1, probability = TRUE)
  pred <- predict(model, Sonar[folds == x, ])
  true <- Sonar$Class[folds == x]
  return(data.frame(pred = pred, true = true))
})

to generate confusion matrix for all left out samples:

z1 <- do.call(rbind, z)
caret::confusionMatrix(z1$pred, z1$true)

to generate for each:

lapply(z, function(x){
  caret::confusionMatrix(x$pred, x$true)
})

For reproducibility, set the seed before creating the folds.
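
As a minimal sketch of the same idea with a fixed seed and only base R's table() (no caret), the loop below collects the out-of-fold predictions into one vector aligned with the data rows. The built-in iris data, the seed value 42, and the names pred, fit and cm are stand-ins chosen for illustration, not part of the original question.

```r
library(e1071)

set.seed(42)  # arbitrary seed; fixes the random fold assignment
k <- 10
folds <- sample(rep(1:k, length.out = nrow(iris)))

# Empty factor with the right levels, one slot per row of the data
pred <- factor(rep(NA, nrow(iris)), levels = levels(iris$Species))

for (i in 1:k) {
  # Train on all folds except i, then predict the held-out fold i
  fit <- svm(Species ~ ., data = iris[folds != i, ], gamma = 0.5, cost = 1)
  pred[folds == i] <- predict(fit, iris[folds == i, ])
}

# Confusion matrix over all held-out predictions, and overall accuracy
cm <- table(Prediction = pred, Reference = iris$Species)
cm
sum(diag(cm)) / sum(cm)
```

Because every row is predicted exactly once by a model that never saw it, cm is the cross-validated confusion matrix rather than a training-set one.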

In general, if you do this sort of thing often, choose a higher-level library such as mlr or caret.
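
For example, a rough sketch with caret, assuming the kernlab package is installed (caret's "svmRadial" method uses kernlab rather than e1071, so results will not match svm() exactly). The iris data and the fixed sigma and C values are placeholders for illustration; sigma plays the role that gamma plays in e1071.

```r
library(caret)

set.seed(42)  # arbitrary seed so the fold assignment is repeatable

# Keep the held-out predictions of the final model
ctrl <- trainControl(method = "cv", number = 10, savePredictions = "final")

fit <- train(Species ~ ., data = iris,
             method = "svmRadial",
             tuneGrid = data.frame(sigma = 0.5, C = 1),  # single setting, no tuning
             trControl = ctrl)

# fit$pred holds one held-out prediction (pred) and true label (obs) per row
confusionMatrix(fit$pred$pred, fit$pred$obs)
```

The Resample column of fit$pred identifies the fold, so per-fold matrices can be obtained by splitting on it.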

Upvotes: 4

Davide Bottoli

Reputation: 91

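Suppose you want to create a confusion matrix of predicted versus actual values for a dataset called dtm, where the target variable is called type. First, generate predictions from the model. Note that this predicts on the same data the model was trained on, so the resulting matrix describes the fit on the training set, not the cross-validation results: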

prediction <- predict(model, dtm)

Then you can create the confusion matrix with the code:

library(caret)
confusionMatrix(prediction, dtm$type, dnn = c("Prediction", "Reference"))

Hope it's clear enough.

Upvotes: 1
