Evaluation Error : Need at least one column for 'n_distinct()'

Question

I am using the R programming language. I have a data frame (my_file) with two columns: my_date (e.g. 2000-01-15, stored as a factor) and blood_type (also a factor). I am trying to use the dplyr library to produce distinct counts by group (by month).

I figured out how to make non-distinct counts:

library(dplyr)

new_file <- my_file %>%
  mutate(date = as.Date(my_date)) %>%
  group_by(blood_type, month = format(date, "%Y-%m")) %>%
  summarise(count = n())

But this does not work for distinct counts:

new_file <- my_file %>%
  mutate(date = as.Date(my_date)) %>%
  group_by(blood_type, month = format(date, "%Y-%m")) %>%
  summarise(count = n_distinct())

Evaluation Error : Need at least one column for 'n_distinct()'

I tried to explicitly reference the column, but this produces an empty file:

new_file <- my_file %>%
  mutate(date = as.Date(my_date)) %>%
  group_by(blood_type, month = format(date, "%Y-%m")) %>%
  summarise(count = n_distinct(my_file$blood_type))

Can someone please show me what I am doing wrong?

Thanks

Ronak Shah · Accepted Answer

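If you want to count the distinct blood_type values for each month, don't include blood_type in group_by: grouping by it leaves only one blood type per group, so the distinct count would always be 1. Also refer to the column by its bare name inside summarise rather than as my_file$blood_type, so that n_distinct() only sees the rows of the current group. Try: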

library(dplyr)

new_file <- my_file %>%
  mutate(date = as.Date(my_date)) %>%
  group_by(month = format(date, "%Y-%m")) %>%
  summarise(count = n_distinct(blood_type))
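
To check the behaviour on something small, here is a minimal sketch that runs the same pipeline on made-up data. The five rows below are purely hypothetical (they are not from the question); both columns are built as factors to match the description of my_file.

```r
library(dplyr)

# Hypothetical toy data, standing in for the asker's my_file.
# Both columns are factors, as described in the question.
my_file <- data.frame(
  my_date = factor(c("2000-01-15", "2000-01-20", "2000-01-25",
                     "2000-02-03", "2000-02-10")),
  blood_type = factor(c("A", "A", "B", "O", "O"))
)

new_file <- my_file %>%
  mutate(date = as.Date(my_date)) %>%              # factor -> Date
  group_by(month = format(date, "%Y-%m")) %>%      # group by month only
  summarise(count = n_distinct(blood_type))        # bare column name, evaluated per group

print(new_file)
# 2000-01 has blood types A and B, so count is 2
# 2000-02 has only blood type O, so count is 1
```

If you instead need one row per blood_type and month, keep blood_type in group_by and count something else inside n_distinct(), for example a patient ID column if your data has one.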
