Martina Morris

Reputation: 165

Error: `n()` must only be used inside dplyr verbs

Running R 4.0.2 and dplyr 1.0.2

I am trying to use n = n() in a summarize call on a srvyr object:

relduration_by_age_grp <- l %>% 
  filter(ongoing == 0 & ptype == i) %>% 
  select(ego.id, ptype, age.grp, ego.age.grp, duration, ego.wawt) %>%
  mutate(min.age.grp = ifelse(age.grp < ego.age.grp, 
                              age.grp,
                              ego.age.grp)) %>%
  srvyr::as_survey(ids=1, weights=ego.wawt) %>%
  group_by(ptype, min.age.grp) %>%
  summarize(n = n(),
            wtd.median = srvyr::survey_median(duration, na.rm=TRUE),
            wtd.mean = srvyr::survey_mean(duration, na.rm=TRUE), 
            median = srvyr::unweighted(median(duration, na.rm=TRUE)),
            mean = srvyr::unweighted(mean(duration, na.rm=TRUE)))

Based on other questions/answers, I've also tried using dplyr::summarize(n = dplyr::n(), ...), but that results in the same error. Is the problem that it is not possible to use dplyr's n() on a srvyr object? There does not appear to be a similar function in srvyr that can be used in a summarize call.

thanks!

Upvotes: 15

Views: 39659

Answers (4)

Satyam Saxena

Reputation: 11

Call the function with its namespace, dplyr::summarise(), when counting the number of data points in your data.

Upvotes: 1

Giuseppe D'alterio

Reputation: 229

Maybe it is because you loaded a package, such as "operators", that masks "%>%" from the dplyr package.

Upvotes: 1

Hammao

Reputation: 879

This error can occur when R picks up the wrong summarize function (dplyr vs. plyr). Both packages export a function with that name, and the one loaded last masks the other.

Fortunately, we can tell R explicitly which package to use by putting the package name and :: in front of the function.

So use dplyr::summarise().
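As a rough illustration of the namespace-qualified call, here is a minimal sketch on a plain data frame. The data frame df and its columns g and x are made up for this example and are not from the question; the last line is just one way to check which package the unqualified summarise currently resolves to.

```r
library(dplyr)

# Hypothetical toy data (not from the question)
df <- data.frame(g = c("a", "a", "b"), x = c(1, 2, 3))

# Qualify every verb so that masking by another package (e.g. plyr) cannot matter
counts <- df %>%
  dplyr::group_by(g) %>%
  dplyr::summarise(n = dplyr::n(), .groups = "drop")

print(counts)

# Which package does the bare name resolve to in this session?
print(environmentName(environment(summarise)))
```

Note that this only addresses masking on ordinary data frames; it may not help when the object is a srvyr tbl_svy, as in the question.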

Upvotes: 25

Ben Bolker

Reputation: 226162

As far as I can tell, unlike dplyr (which accepts pretty much any summary function that returns a scalar, as well as its own specialized functions such as n()), srvyr::summarize gives you a limited choice of functions: from ?srvyr::summarize,

Summarise for ‘tbl_svy’ objects accepts several specialized functions. [emphasis added]

i.e., survey_mean, survey_total, survey_ratio, and a couple of others

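Here's a hack that seems to work: calculate the sum (survey_total) of the inverse weights. Each row then contributes weight * (1/weight) = 1 to the total, so the weighted total is just the row count.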

library(srvyr)
data(api, package="survey")
aa <- (apistrat 
      %>% as_survey_design(strata=stype, weights=pw) 
      %>% group_by(stype) 
)
aa %>% summarize(n=survey_total(1/pw))

This matches the output of table(apistrat$stype).
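For comparison, here is a sketch that puts the inverse-weight hack next to srvyr's unweighted() wrapper, which the question already uses for median and mean. Wrapping n() in unweighted() is an assumption on my part that it behaves the same way in your srvyr version (it appears in the srvyr documentation examples for unweighted, but I have not checked it against dplyr 1.0.2); the column names n_hack and n_unw are made up for this example.

```r
library(dplyr)
library(srvyr)
data(api, package = "survey")

res <- apistrat %>%
  as_survey_design(strata = stype, weights = pw) %>%
  group_by(stype) %>%
  summarize(
    n_hack = survey_total(1 / pw),  # weighted total of inverse weights
    n_unw  = unweighted(n())        # assumed: plain row count per group
  )

print(res)

# Both columns should agree with the raw counts per stratum
print(table(apistrat$stype))
```

If unweighted(n()) fails in your setup, the survey_total(1/pw) version above does not rely on n() at all.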

Upvotes: 6
