Sparkringo

Reputation: 376

dplyr summarise multiple columns and output multiple dataframes

I want to be able to summarise multiple columns separately and have a separate dataframe output for each summary. Right now, I'm doing it manually:

Example:

manufacturer = mpg %>% 
  select(manufacturer) %>% 
  group_by(manufacturer) %>% 
  summarise(
    count = n()
  )

model = mpg %>% 
  select(model) %>% 
  group_by(model) %>% 
  summarise(
    count = n()
  )

## etc. for each column of mpg.

Is there a way to do this automatically in some kind of a loop? I want the dataframe names to be the column names.

Upvotes: 2

Views: 620

Answers (3)

Ronak Shah

Reputation: 389047

Another option is to reshape the data to long format with pivot_longer and count each value within each column. This requires converting all columns to character first, because pivot_longer cannot combine columns of different types into a single value column. If you need separate dataframes, you can use group_split to split the result into a list of dataframes.

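library(dplyr)
library(tidyr)
library(ggplot2)  # provides the mpg dataset

mpg %>%
  # convert every column to character so they can share one value column
  mutate(across(everything(), as.character)) %>%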
  pivot_longer(cols = everything()) %>%
  count(name, value, name = "count") %>%
  group_split(name, .keep = FALSE)

#[[1]]
# A tibble: 7 × 2
#  value      count
#  <chr>      <int>
#1 2seater        5
#2 compact       47
#3 midsize       41
#4 minivan       11
#5 pickup        33
#6 subcompact    35
#7 suv           62

#[[2]]
# A tibble: 21 × 2
#   value count
#   <chr> <int>
# 1 11       20
# 2 12        8
# 3 13       21
# 4 14       19
# 5 15       24
#...
#...

As others have already pointed out, it is better to keep the data in a list than in separate individual dataframes.
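Note that group_split returns an unnamed list, so the column names are not attached to the elements. A minimal sketch of one way to keep them, assuming base split() is acceptable in place of group_split (the object names `counts` and `by_col` are only illustrative):

```r
library(dplyr)
library(tidyr)
library(ggplot2)  # provides the mpg dataset

# Long-format counts: one row per (column name, value) pair
counts <- mpg %>%
  mutate(across(everything(), as.character)) %>%
  pivot_longer(cols = everything()) %>%
  count(name, value, name = "count")

# split() names each list element after its grouping value,
# which here is the original column name
by_col <- split(select(counts, -name), counts$name)

by_col$class
```

Each element can then be accessed by column name, for example by_col$manufacturer.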

Upvotes: 2

Maël

Reputation: 52059

You only need count here. Loop over all columns with imap, which passes each column as .x and its name as .y:

library(tidyverse)
imap(mpg, ~ {nm1 <- .y
  count(data.frame(x = .x), x, name = "count") %>% 
    rename_with(~ nm1, 1)})

Then, to put the data frames from the list into your global environment, use list2env.
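A hedged sketch of that last step, under a couple of assumptions: the loop is rewritten with an explicit function and setNames instead of the nested formula, and the objects are placed in a fresh environment (the names `lst` and `env` are illustrative) rather than the global one, so that existing objects are not overwritten:

```r
library(tidyverse)  # loads dplyr, purrr, and ggplot2 (which provides mpg)

# One count table per column; imap keeps the column names as list names
lst <- imap(mpg, function(col, nm) {
  res <- count(data.frame(x = col), x, name = "count")
  setNames(res, c(nm, "count"))  # rename the value column to the column name
})

# list2env needs a named list; each element becomes an object in `env`
env <- new.env()
list2env(lst, envir = env)

env$manufacturer
```

Passing .GlobalEnv as the envir argument instead would create the objects directly in the global environment, as the question asks.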

Upvotes: 2

akrun

Reputation: 887223

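We may loop over the column names:

library(dplyr)
library(purrr)
library(ggplot2)  # provides the mpg dataset
# name the input vector so the output list is named by column
lst1 <- map(setNames(names(mpg), names(mpg)),  
  ~ mpg %>% 
      select(all_of(.x)) %>% 
      group_by(across(all_of(.x))) %>%
      summarise(count = n()) )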

It is better to keep the results in a list. If we want separate objects, use list2env:

list2env(lst1, .GlobalEnv)

Upvotes: 2
