dplyr summarise and then summarise_at in the same pipe

Question

This question has come up before, and there are some solutions, but I could not find one for this specific case. For example:

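library(dplyr)
library(ggplot2) # provides the diamonds dataset

# add five random columns whose names contain 'blah'
my_diamonds <- diamonds %>% 
  mutate(blah_var1 = rnorm(n()),
         blah_var2 = rnorm(n()),
         blah_var3 = rnorm(n()),
         blah_var4 = rnorm(n()),
         blah_var5 = rnorm(n()))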

my_diamonds %>% 
  group_by(cut) %>% 
  summarise(MaxClarity = max(clarity),
            MinTable = min(table), .groups = 'drop') %>% 
  summarise_at(vars(contains('blah')), mean)

I want a new data frame showing the max clarity, the min table, and the mean of each of the blah variables for each cut. The above returned an empty tibble. Based on some other SO posts, I tried using mutate and then summarise_at:

my_diamonds %>% 
  group_by(cut) %>% 
  mutate(MaxClarity = max(clarity),
         MinTable = min(table)) %>% 
  summarise_at(vars(contains('blah')), mean)

This returns a tibble, but only with the blah variables; MaxClarity and MinTable are missing.

Is there a way to combine summarise and summarise_at in the same dplyr chain?

akrun · Accepted Answer

The issue is that after the first call to summarise, only the grouping column ('cut') and the newly summarised columns ('MaxClarity' and 'MinTable') remain, so there are no 'blah' columns left for summarise_at to select. In addition, the grouping is removed after that first step by .groups = 'drop'. Instead, we can use across inside a single summarise call:

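library(dplyr) # version >= 1.0, needed for across()
my_diamonds %>% 
  group_by(cut) %>% 
  summarise(MaxClarity = max(clarity),
            MinTable = min(table),
            # a lambda avoids passing extra arguments through across(),
            # which is deprecated in dplyr >= 1.1
            across(contains('blah'), ~ mean(.x, na.rm = TRUE)),
            .groups = 'drop')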
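
To see why the two-step version loses the 'blah' columns, here is a minimal sketch on a small made-up tibble. The name toy and its values are hypothetical stand-ins for my_diamonds, chosen only so the result is easy to check by hand; it assumes dplyr >= 1.0.

```r
library(dplyr) # assumes dplyr >= 1.0 for across()

# Hypothetical toy data standing in for my_diamonds
toy <- tibble(
  cut       = c("Fair", "Fair", "Good", "Good"),
  table     = c(55, 60, 58, 62),
  blah_var1 = c(1, 3, 2, 4),
  blah_var2 = c(10, 20, 30, 50)
)

# After the first summarise(), only the grouping column and the new
# summary column remain, so nothing matching 'blah' is left to select.
step1 <- toy %>%
  group_by(cut) %>%
  summarise(MinTable = min(table), .groups = "drop")
names(step1)

# Computing everything in one summarise() keeps the source columns in scope.
result <- toy %>%
  group_by(cut) %>%
  summarise(MinTable = min(table),
            across(contains("blah"), ~ mean(.x, na.rm = TRUE)),
            .groups = "drop")
result
```

names(step1) contains only cut and MinTable, while result has one row per cut with MinTable plus the mean of each blah column.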
