Reputation: 99
I have a dataframe with data from a survey. I would like to produce a report in table format with the frequencies of each variable.
So working with the dataset mtcars, having this:
> count(mtcars, cyl)
cyl n
1 4 11
2 6 7
3 8 14
> count(mtcars, gear)
gear n
1 3 15
2 4 12
3 5 5
I would like to produce a table like this (or something similar):
variable | n |
---|---|
cyl | |
4 | 11 |
6 | 7 |
8 | 14 |
gear | |
3 | 15 |
4 | 12 |
5 | 5 |
Any idea as to how this may be achievable?
Upvotes: 0
Views: 1163
Reputation: 17134
We can write a nested pair of functions to map count
to multiple variables and row-bind the results, using a little tidy evaluation:
library(dplyr)
library(purrr)
count_multi <- function(.data, ...) {
count_var <- function(var, .data) {
.data %>%
count(Value = factor({{ var }})) %>% # coerce to factor to allow multiple
mutate( # var types and preserve ordering
Variable = as.character(ensym(var)),
.before = everything()
)
}
map_dfr(enquos(...), count_var, .data = .data)
}
mtcars2 <- mtcars %>%
mutate(
vs = factor(vs, labels = c("V", "S")),
am = factor(am, labels = c("manual", "automatic"))
)
mtcars2 %>%
count_multi(vs, am, cyl)
Output:
Variable Value n
1 vs V 18
2 vs S 14
3 am manual 19
4 am automatic 13
5 cyl 4 11
6 cyl 6 7
7 cyl 8 14
I believe you can use kableExtra::pack_rows()
to create subheaders for each Variable
in markdown.
Upvotes: 1
Reputation: 1873
The below gets us the output in slightly different format. However, it does allow for subset (using column variable
which OP's requirement does not.)
library(data.table)
df <- setDT(copy(mtcars))
# select columns as grouping by continuous variables is not appropriate
x <- c('cyl', 'gear')
y <- lapply(x, \(i) df[, .N, i])
names(y) <- x
y <- rbindlist(y, idcol=T, use.names=F)
names(y) <- c('variable', 'class', 'count')
variable class count
1: cyl 6 7
2: cyl 4 11
3: cyl 8 14
4: gear 4 12
5: gear 3 15
6: gear 5 5
Upvotes: 0