itsMeInMiami

Reputation: 2783

When using `dplyr::summarise()` with the `across()` function, can I mix the list and formula syntax?

I would like to use dplyr::summarise() with the dplyr::across() function to produce a table that has the number of non-missing values, the mean, and the standard deviation for a couple of variables. I can get the count of non-missing values and the mean using the purrr-style formula syntax, but I can't figure out how to get all the summaries into a single table without using multiple summarise() calls and then bind_cols():

iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), ~sum(!is.na(.))))

iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), ~mean(., na.rm = TRUE)))

Is there a way to combine the list syntax:

iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), list(mean = mean, sd = sd)))  

with the purrr-style formula syntax shown above to get the number of non-missing values, the mean, and the standard deviation all at once?

Upvotes: 4

Views: 600

Answers (1)

Ronak Shah

Reputation: 389355

To apply multiple functions in the same across() call, use the list syntax. The list can mix purrr-style formulas with bare function names:

library(dplyr)

iris %>%
  group_by(Species) %>%
  summarise(across(starts_with("Sepal"), list(sum = ~sum(!is.na(.)), 
                                              mean = mean, sd = sd)))
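
A possible refinement, offered as a sketch rather than as part of the original answer: with `mean = mean` and `sd = sd`, any group that contains a missing value would get `NA` for those summaries, so the variant below passes `na.rm = TRUE` through formulas, as the question's own `mean()` call does. The list name `n` and the `result` variable are choices made here for illustration. The `.names` pattern is spelled out only to make the output column names explicit; it matches what `across()` uses by default when given a named list.

```r
library(dplyr)

result <- iris %>%
  group_by(Species) %>%
  summarise(across(
    starts_with("Sepal"),
    list(
      n    = ~sum(!is.na(.x)),          # count of non-missing values
      mean = ~mean(.x, na.rm = TRUE),   # mean, ignoring missing values
      sd   = ~sd(.x, na.rm = TRUE)      # standard deviation, ignoring missing values
    ),
    .names = "{.col}_{.fn}"             # e.g. Sepal.Length_n, Sepal.Length_mean
  ))

result
```

This should give one row per species, with three summary columns for each of `Sepal.Length` and `Sepal.Width`.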

Upvotes: 7
