Reputation: 39
I am using R markdown and qwraps2 and dplyr packages
library(dplyr)
library(qrwaps2)
I have a dataset with one categorical variable and 2 numeric variables that I want to get the count and percentage of the categorical variable in the following format: count (percentage %)
The dataset is below:
structure(list(SIDE = c("Left", "Right", "Left", "Right", "Left",
"Right", "Right", "Right", "Right", "Right", "Left", "Left",
"Left", "Right", "Left", "Right", "Right", "Left", "Left", "Left",
"Left", "Right", "Right"), PREOP_mTFA = c(163.5, 164.9, 168.7,
170.3, 162.8, 166.7, 171, 165.9, 165.9, 170.8, 170.5, 173.3,
167.7, 170.7, 159, 170.9, 168.2, 171.2, 164, 166.6, 169.1, 171.2,
175.9), PREOP_mLDFA = c(86, 95, 90, 86, 92, 89, 92, 96, 90, 86,
89, 87, 93, 90, 98, 89, 90, 88, 92, 91, 89, 90, 88)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -23L))
I used the following code to get count and percentage but they come as separate.
DLO_TEST %>% group_by(SIDE) %>%
summarise(count = n())
DLO_TEST %>% group_by(SIDE) %>%
summarise(percent = 100 * n() / nrow(DLO_TEST))
I want to have one output in the format of count then percentage in parentheses with percentage sign% : count (percentage %). How to modify the code to do that?
I tried
qwraps2::n_perc(DLO_TEST$SIDE == Right)
This did not work. I think it works only for numeric variables.
Upvotes: 0
Views: 70
Reputation: 73377
Base R.
> dat$SIDE |>
+ table() |>
+ as.data.frame() |>
+ transform(frac=proportions(Freq)) |>
+ within(foo <- mapply(sprintf, '%s (%s%%)', Freq, round(frac, 3)*100))
Var1 Freq frac foo
1 Left 11 0.4782609 11 (47.8%)
2 Right 12 0.5217391 12 (52.2%)
Upvotes: 1
Reputation: 146020
Maybe something like this?
DLO_TEST |>
summarize(
count = n(),
.by = SIDE
) |>
mutate(
percent = scales::percent_format(accuracy = 0.1)(count / sum(count)),
your_string = glue::glue("{count} ({percent})")
)
# # A tibble: 2 × 4
# SIDE count percent your_string
# <chr> <int> <chr> <glue>
# 1 Left 11 47.8% 11 (47.8%)
# 2 Right 12 52.2% 12 (52.2%)
If you have more categorical variables, then this exact format doesn't make sense as they may have different numbers of levels. You could maybe pivot your data to a long format first so they are all in one column. You may also be interested in the table1
package which does a nice job automating tables like this.
Upvotes: 2