Reputation: 537
Simple question. Given the data frame below, I want to count distinct IDs twice: once over all records and once after filtering on status. However, the %>% pipe doesn't seem to work here. I just want a single value as output (for total this should be 10, for closed it should be 5), not a data frame. Neither of the commented-out (#) lines works.
library(dplyr)
dat <- data.frame(ID = as.factor(1:10),
                  status = as.factor(rep(c("open", "closed"), 5)))
total <- n_distinct(dat$ID)
#closed <- dat %>% filter(status == "closed") %>% n_distinct(dat$ID)
#closed <- dat %>% filter(status == "closed") %>% n_distinct(ID)
Upvotes: 0
Views: 693
Reputation: 887571
An option with data.table (this returns a one-row data.table with a column n):
library(data.table)
setDT(dat)[status == "closed"][, .(n = uniqueN(ID))]
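Since the question asks for a single value rather than a table, here is a minimal sketch of one way to get a bare integer from data.table. It rebuilds the question's example data and assumes only the data.table package; the variable name closed is taken from the question.

```r
library(data.table)

# Same example data as in the question
dat <- data.frame(ID = as.factor(1:10),
                  status = as.factor(rep(c("open", "closed"), 5)))

setDT(dat)  # converts dat to a data.table by reference (no copy)

# Filter rows in i and compute in j; because j is not wrapped in .(),
# the result is a plain integer instead of a data.table
closed <- dat[status == "closed", uniqueN(ID)]
closed
```

Wrapping j in .() or list() is what makes data.table return a table, so dropping the wrapper is usually enough when a scalar is wanted.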
Upvotes: 0
Reputation: 389175
n_distinct expects a vector as input, but the pipe passes it the whole data frame as its first argument. You can do:
library(dplyr)
dat %>%
filter(status == "closed") %>%
summarise(n = n_distinct(ID))
# n
#1 5
Or without using filter:
dat %>% summarise(n = n_distinct(ID[status == "closed"]))
You can add %>% pull(n) to either of the above if you want a vector back rather than a data frame.
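Put together, the pull(n) variant might look like the sketch below. It rebuilds the question's example data so it runs on its own. The second form, which pulls ID out first and then calls n_distinct on the resulting vector, is an alternative I am adding for illustration and is not part of the answer above.

```r
library(dplyr)

# Same example data as in the question
dat <- data.frame(ID = as.factor(1:10),
                  status = as.factor(rep(c("open", "closed"), 5)))

# summarise() returns a one-row data frame; pull(n) extracts column n as a vector
closed <- dat %>%
  filter(status == "closed") %>%
  summarise(n = n_distinct(ID)) %>%
  pull(n)

# Alternative: extract the ID vector first, then count its distinct values
closed_alt <- dat %>%
  filter(status == "closed") %>%
  pull(ID) %>%
  n_distinct()

closed
closed_alt
```

Both forms hand n_distinct a vector rather than a data frame, which is the point of the explanation above.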
Upvotes: 1