bird
bird

Reputation: 3316

How to produce a similar (possibly better) confusion matrix table / data frame (as shown in the photo below) using R

I have confusion matrix results of my machine learning models and I have to present my results. I made the following table manually using Microsoft Word shown in the photo below. As you can see it is not a good-looking table and more importantly, it takes so much time to transfer the results one by one from R to Microsoft Word and do manual calculation of errors.

This is the table I would like to produce using R since most of my analysis is to be done in R. I am also very open to your suggestions to make it even nicer, since I will use the table in a scientific presentation.

enter image description here

For reproducibility, I used the code dput(cm_df) (which is my confusion matrix converted to data.frame using as.data.frame(cm_table)) and got this result:

structure(list(Prediction = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("1", 
"2", "3", "4", "5", "6"), class = "factor"), Reference = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 
3L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 
6L, 6L, 6L), .Label = c("1", "2", "3", "4", "5", "6"), class = "factor"), 
    Freq = c(1L, 0L, 0L, 0L, 0L, 0L, 1L, 9L, 0L, 0L, 1L, 0L, 
    1L, 2L, 12L, 1L, 2L, 0L, 0L, 4L, 1L, 0L, 1L, 1L, 0L, 7L, 
    1L, 0L, 15L, 0L, 0L, 0L, 2L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-36L))

Any ideas?

Upvotes: 0

Views: 1530

Answers (1)

user12728748
user12728748

Reputation: 8506

There are many options and packages for formatting tables, and they provide different output formats (e.g. markdown, html, pdf, docx,...). Here is one example using the huxtable package:

library(data.table)
library(huxtable)
library(dplyr)

# reformatted your cm_df data.frame
res <- dcast(as.data.table(cm_df), Prediction ~ Reference, value.var = "Freq")

# extracted the numeric matrix to calculate the statistics
mat <- data.matrix(res[,-1])

# set res as character (required for merging)
res[] <- lapply(res, as.character)

# calculate and format the statistics
eoc <- (rowSums(mat) - diag(mat))/rowSums(mat)
res[, `:=`(UA = paste0(round(100*(1-eoc)), "%"),
    `Error of Commission` = paste0(round(100*eoc), "%"))]
PA <- paste0(round(100*diag(mat)/colSums(mat)), "%")
EO <- paste0(round(100*(1-diag(mat)/colSums(mat))), "%")

# combine column statistics with res
res.tab <- rbind(res, setNames(transpose(data.table(PA=PA, `Er. Omission`=EO), 
  keep.names = "Prediction"), colnames(res)[1:7]), fill=TRUE)

# format the table
out <- as_huxtable(res.tab) %>% 
    set_bold(1, everywhere, TRUE) %>% 
    set_bold(everywhere, 1, TRUE) %>%
    set_bottom_border(1, everywhere) %>% 
    set_bottom_border(7, everywhere) %>% 
    set_left_border(everywhere, c(2,8), TRUE) %>% 
    set_align(1, everywhere, "center") %>% 
    set_align(everywhere, 1, "center") %>% 
    set_align(c(2:9), c(2:9), "right") %>% 
    set_col_width(c(0.4, rep(0.2, 6), rep(.3,2))) %>% 
    set_position("left")

# print table to screen (usually would export in preferred format)
print_screen(out)
#>     Prediction │    1     2     3      4     5  
#> ───────────────┼────────────────────────────────
#>              1 │    1     1     1      0     0  
#>              2 │    0     9     2      4     7  
#>              3 │    0     0    12      1     1  
#>              4 │    0     0     1      0     0  
#>              5 │    0     1     2      1    15  
#>              6 │    0     0     0      1     0  
#> ───────────────┼────────────────────────────────
#>             PA │ 100%   82%   67%     0%   65%  
#>   Er. Omission │   0%   18%   33%   100%   35%  
#> 
#> Column names: Prediction, 1, 2, 3, 4, 5, 6, UA, Error of Commission
#> 
#> 6/9 columns shown.

Edit:

As requested, you could add the following code to get some annotations:

# add an empty first column and merge cells
out <- merge_down(as_huxtable(cbind(rep("", 9), out)), 2:8, 1)

# add desired label
out[2,1] <- "Classification"

# add top caption and rotate text in first column
out %>% 
    set_caption("Reference") %>% 
    set_rotation(everywhere, 1, 90)

output (html version):

Upvotes: 1

Related Questions