Brad

Reputation: 680

Using across function in dplyr

I have a dataframe which contains missing values.

# Create dataframe
df <- data.frame(Athlete = rep(c("Ali", "Tyson"), each = 200),
                 Score = sample(c(1:20, NA), 400, replace = TRUE))

My code groups the data by Athlete, then counts the rows in each group whose value is not NA.

library(dplyr)
Result <- df %>%
  dplyr::group_by(Athlete, .drop = TRUE) %>%
  dplyr::summarise_each(list(~sum(!is.na(.))))

I get the desired result, but with a warning message:

`summarise_each_()` is deprecated as of dplyr 0.7.0.
Please use `across()` instead.

I want to update the code so the warning no longer appears. How do I rewrite this with across()?

Note: the warning message also says:

This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 

So if the warning does not appear, restart RStudio (or the R session) and rerun the script to see it again.
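As an alternative to restarting, the throttling can probably be switched off for the session. This is a small sketch that assumes the lifecycle package's `lifecycle_verbosity` option behaves as documented; set it before rerunning the script:

```r
# Assumption: lifecycle reads this option when deciding how to report
# deprecations. "warning" asks for a warning on every call; "default"
# restores the throttled behaviour.
options(lifecycle_verbosity = "warning")
```

Setting the option back to "default" returns to one warning every 8 hours.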

Upvotes: 1

Views: 1023

Answers (1)

Ronak Shah

Reputation: 389235

summarise_each() was deprecated in favour of summarise_all()/summarise_at(), which were in turn superseded by across() in dplyr 1.0.0.

library(dplyr)
df %>%
 group_by(Athlete) %>%
 summarise(across(everything(), ~sum(!is.na(.))))

#  Athlete Score
#  <chr>   <int>
#1 Ali       189
#2 Tyson     195
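The same pattern extends to several columns at once. Below is a minimal sketch, not part of the original data: `df2` and its `Weight` column are hypothetical and only there to give across() a second column to work on. The `.names` argument is used to rename the outputs.

```r
library(dplyr)

set.seed(42)  # fixed seed so the random data is reproducible

# Same shape as the question's data, plus a hypothetical `Weight` column
df2 <- data.frame(
  Athlete = rep(c("Ali", "Tyson"), each = 200),
  Score   = sample(c(1:20, NA), 400, replace = TRUE),
  Weight  = sample(c(90:110, NA), 400, replace = TRUE)
)

# Count the non-missing values of each selected column per athlete;
# .names = "n_{.col}" yields the output columns n_Score and n_Weight
result <- df2 %>%
  group_by(Athlete) %>%
  summarise(across(c(Score, Weight), ~ sum(!is.na(.x)), .names = "n_{.col}"))

result
```

Selecting the columns explicitly with c(Score, Weight) rather than everything() keeps the summary from silently picking up columns added later.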

If you only have one column to summarise, as in your example, you can do it directly:

df %>%
  group_by(Athlete, .drop = TRUE) %>%
  summarise(Score = sum(!is.na(Score)))

Upvotes: 4
