Pelide

Reputation: 528

How to obtain F1, precision, recall and confusion matrix

The goal of my project is to predict the accuracy level of some textual descriptions.

I made the vectors with fastText.

TSV output:

  label lenght
1     0   1:43
2     0   1:10
3     0    1:8
4     0  1:110
5     1  1:105
6     0  1:446

Then I process the .tsv files with the e1071 library, using this script:


library(e1071)

accuracy.model = read.table(file = 'file.tsv', sep = '\t', header = FALSE, col.names = c("label", "lenght"))

head(accuracy.model)

classifier = svm( formula = label ~ .,
                  data = accuracy.model,
                  type = 'C-classification',
                  kernel = 'radial',
                  cost = 32,
                  gamma = 8,
                  cross  = 10)

classifier

With 10-fold cross-validation I'm able to retrieve the overall accuracy percentage.
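(For reference, the per-fold and overall cross-validation accuracy are stored on the fitted object; the component names below are the ones documented in ?svm for models fitted with cross > 0.)

classifier$accuracies    # accuracy of each of the 10 folds
classifier$tot.accuracy  # overall cross-validation accuracy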

I would also like to get the F1 score, precision, and recall values.

For the confusion matrix I went through some other Stack Overflow threads and figured out that it can be done with the caret library, but I don't know how.

Suggestions?

Regards

Upvotes: 1

Views: 1717

Answers (1)

StupidWolf

Reputation: 46888

Let's say we fit a model like this:

library(caret)
library(e1071)
data = iris
data$Species = ifelse(data$Species == "versicolor", "v", "o")

classifier = svm( formula = Species ~ .,
                  data = data,
                  type = 'C-classification',
                  kernel = 'radial',
                  cost = 32,
                  gamma = 8,
                  cross  = 10)

Then we get the confusion matrix:

mat = table(classifier$fitted, data$Species)

And apply the caret function:

confusionMatrix(mat)$byClass

         Sensitivity          Specificity       Pos Pred Value 
           1.0000000            1.0000000            1.0000000 
      Neg Pred Value            Precision               Recall 
           1.0000000            1.0000000            1.0000000 
                  F1           Prevalence       Detection Rate 
           1.0000000            0.6666667            0.6666667 
Detection Prevalence    Balanced Accuracy 
           0.6666667            1.0000000 
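If you only need precision, recall and F1, the named vector returned by $byClass can be subset directly:

confusionMatrix(mat)$byClass[c("Precision", "Recall", "F1")]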

You can apply the same approach to your own data.
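As a sketch, assuming the accuracy.model data frame and the classifier object from your question (with label as the true class and "1" as the class of interest):

mat = table(classifier$fitted, accuracy.model$label)
confusionMatrix(mat, positive = "1")$byClass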

Upvotes: 1
