Ahmed Mabrouk
Ahmed Mabrouk

Reputation: 39

Combining count and percentage in a Table (Categorical variable)

I am using R markdown and qwraps2 and dplyr packages

library(dplyr)
library(qrwaps2)

I have a dataset with one categorical variable and 2 numeric variables that I want to get the count and percentage of the categorical variable in the following format: count (percentage %)

The dataset is below:

structure(list(SIDE = c("Left", "Right", "Left", "Right", "Left", 
"Right", "Right", "Right", "Right", "Right", "Left", "Left", 
"Left", "Right", "Left", "Right", "Right", "Left", "Left", "Left", 
"Left", "Right", "Right"), PREOP_mTFA = c(163.5, 164.9, 168.7, 
170.3, 162.8, 166.7, 171, 165.9, 165.9, 170.8, 170.5, 173.3, 
167.7, 170.7, 159, 170.9, 168.2, 171.2, 164, 166.6, 169.1, 171.2, 
175.9), PREOP_mLDFA = c(86, 95, 90, 86, 92, 89, 92, 96, 90, 86, 
89, 87, 93, 90, 98, 89, 90, 88, 92, 91, 89, 90, 88)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -23L))

I used the following code to get count and percentage but they come as separate.

DLO_TEST %>% group_by(SIDE) %>% 
  summarise(count = n())
DLO_TEST %>% group_by(SIDE) %>% 
  summarise(percent = 100 * n() / nrow(DLO_TEST))

I want to have one output in the format of count then percentage in parentheses with percentage sign% : count (percentage %). How to modify the code to do that?

I tried

qwraps2::n_perc(DLO_TEST$SIDE == Right) 

This did not work. I think it works only for numeric variables.

Upvotes: 0

Views: 70

Answers (2)

jay.sf
jay.sf

Reputation: 73377

Base R.

> dat$SIDE |> 
+   table() |> 
+   as.data.frame() |> 
+   transform(frac=proportions(Freq)) |> 
+   within(foo <- mapply(sprintf, '%s (%s%%)', Freq, round(frac, 3)*100))
   Var1 Freq      frac        foo
1  Left   11 0.4782609 11 (47.8%)
2 Right   12 0.5217391 12 (52.2%)

Upvotes: 1

Gregor Thomas
Gregor Thomas

Reputation: 146020

Maybe something like this?

DLO_TEST |>
  summarize(
    count = n(),
    .by = SIDE
  ) |>
  mutate(
    percent = scales::percent_format(accuracy = 0.1)(count / sum(count)),
    your_string = glue::glue("{count} ({percent})")
  )
# # A tibble: 2 × 4
#   SIDE  count percent your_string
#   <chr> <int> <chr>   <glue>     
# 1 Left     11 47.8%   11 (47.8%) 
# 2 Right    12 52.2%   12 (52.2%) 

If you have more categorical variables, then this exact format doesn't make sense as they may have different numbers of levels. You could maybe pivot your data to a long format first so they are all in one column. You may also be interested in the table1 package which does a nice job automating tables like this.

Upvotes: 2

Related Questions