vagvaf
vagvaf

Reputation: 99

Report frequency for multiple variables in a dataframe in R

I have a dataframe with data from a survey. I would like to produce a report in table format with the frequencies of each variable.

So working with the dataset mtcars, having this:

> count(mtcars, cyl)
  cyl  n
1   4 11
2   6  7
3   8 14
> count(mtcars, gear)
  gear  n
1    3 15
2    4 12
3    5  5

I would like to produce a table like this (or something similar):

variable n
cyl
4 11
6 7
8 14
gear
3 15
4 12
5 5

Any idea as to how this may be achievable?

Upvotes: 0

Views: 1163

Answers (2)

zephryl
zephryl

Reputation: 17134

We can write a nested pair of functions to map count to multiple variables and row-bind the results, using a little tidy evaluation:

library(dplyr)
library(purrr)

count_multi <- function(.data, ...) {
  count_var <- function(var, .data) {
    .data %>% 
      count(Value = factor({{ var }})) %>%  # coerce to factor to allow multiple
      mutate(                               # var types and preserve ordering
        Variable = as.character(ensym(var)),
        .before = everything()
      )
  }
  map_dfr(enquos(...), count_var, .data = .data)
}

mtcars2 <- mtcars %>% 
  mutate(
    vs = factor(vs, labels = c("V", "S")),
    am = factor(am, labels = c("manual", "automatic"))
  )

mtcars2 %>% 
  count_multi(vs, am, cyl)

Output:

  Variable     Value  n
1       vs         V 18
2       vs         S 14
3       am    manual 19
4       am automatic 13
5      cyl         4 11
6      cyl         6  7
7      cyl         8 14

I believe you can use kableExtra::pack_rows() to create subheaders for each Variable in markdown.

Upvotes: 1

Sweepy Dodo
Sweepy Dodo

Reputation: 1873

The below gets us the output in slightly different format. However, it does allow for subset (using column variable which OP's requirement does not.)

library(data.table)

df <- setDT(copy(mtcars))

# select columns as grouping by continuous variables is not appropriate
x <- c('cyl', 'gear')

y <- lapply(x, \(i) df[, .N, i])

names(y) <- x

y <- rbindlist(y, idcol=T, use.names=F)

names(y) <- c('variable', 'class', 'count')

   variable class count
1:      cyl     6     7
2:      cyl     4    11
3:      cyl     8    14
4:     gear     4    12
5:     gear     3    15
6:     gear     5     5

Upvotes: 0

Related Questions