Ali

Reputation: 1080

Error while using n() inside summarise_at()

While using n() within summarise_at(), I obtain this error:

Error: n() should only be called in a data context
Call `rlang::last_error()` to see a backtrace

Others have suggested this could be caused by plyr masking dplyr functions. Two proposed solutions are:

  1. Replace summarise_at() with `dplyr::summarise_at()`
  2. Call detach("package:plyr", unload=TRUE)

Neither has removed the error, and I'd like to understand what is causing it. Here is a reproducible example that should produce the same error:

library(dplyr)

Df <- data.frame(
  Condition = c(rep("No", 20), rep("Yes",20)),
  Height = c(rep(1,10),rep(2,10),rep(1,10),rep(2,10)),
  Weight = c(rep(10,5),rep(20,5),rep(30,5), rep(40,5))
)

x <- c("Height","Weight")

Df %>% 
  group_by(Condition) %>% 
  summarise_at(vars(one_of(x)), c(mean = mean, sd = sd, count = n()))

Note: if you remove count = n(), the code runs without any issue.

Upvotes: 1

Views: 1113

Answers (1)

caldwellst

Reputation: 5956

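I believe it is because n() is written with parentheses, so it is called immediately while the list of functions is being built, outside the data context that mutate, filter, or summarise provide. It also takes no arguments, so it cannot be applied to a column the way mean and sd are. Just use length instead, which takes the column as its argument.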

Df %>% 
  group_by(Condition) %>% 
  summarise_at(vars(one_of(x)), c(mean = mean, sd = sd, count = length))
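
To see where the error comes from, here is a minimal sketch, assuming only that dplyr is attached. The names funs_ok and msg are just for illustration. The c(...) call is evaluated before summarise_at() runs, so anything written with parentheses is called right there:

```r
library(dplyr)

# mean, sd and length are passed as function objects; nothing is called yet.
funs_ok <- c(mean = mean, sd = sd, count = length)

# n() has parentheses, so it is called while the list is being built,
# before any dplyr verb has set up a data context.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(
  c(mean = mean, sd = sd, count = n()),
  error = function(e) conditionMessage(e)
)
print(msg)
```

The exact wording of the message depends on the dplyr version, but the failure happens before any data is touched.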

If you only want a single count column, compute it first and add it to the grouping:

Df %>% 
  group_by(Condition) %>%
  mutate(count = n()) %>%
  group_by(Condition, count) %>%
  summarise_at(vars(one_of(x)), c(mean = mean, sd = sd))
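
As an alternative sketch, not part of the original answer and assuming dplyr 1.0.0 or later (where across() is available), the count can be computed with n() directly inside summarise(), next to the per-column statistics. The name result is only for illustration:

```r
library(dplyr)

# Same data as in the question
Df <- data.frame(
  Condition = c(rep("No", 20), rep("Yes", 20)),
  Height = c(rep(1, 10), rep(2, 10), rep(1, 10), rep(2, 10)),
  Weight = c(rep(10, 5), rep(20, 5), rep(30, 5), rep(40, 5))
)

x <- c("Height", "Weight")

result <- Df %>%
  group_by(Condition) %>%
  summarise(
    count = n(),                                    # called inside summarise(), so a data context exists
    across(all_of(x), list(mean = mean, sd = sd)),  # one mean and one sd column per variable
    .groups = "drop"
  )

print(result)
```

Here n() works because it is called inside summarise() itself rather than while building a list of functions, and the result has one count column per group plus columns named like Height_mean and Weight_sd.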

Upvotes: 7
