Reputation: 5819
library(dplyr)
distinct(mtcars, mpg)
displays the unique occurrences of mpg classes in mtcars.
n_distinct(mtcars, mpg)
counts them and displays the correct count, 32.
distinct(mtcars, cyl)
displays the unique occurrences of cylinder classes in mtcars.
n_distinct(mtcars, cyl)
yields an error. Why doesn't it work like the mpg example above? The error message looks wrong to me: cyl is definitely a column in the mtcars data frame.
Error in n_distinct_multi(list(...), na.rm) : object 'cyl' not found
Upvotes: 2
Views: 1337
Reputation: 2105
The dplyr::n_distinct() function is not a table verb like mutate(), filter(), etc. Its ... parameter should be "vectors of values" (per the official documentation).
So when you say dplyr::n_distinct(mtcars, mpg), what is really happening is that the unique values of the first argument, mtcars, are being counted. Since it has 32 distinct rows, the value is 32. In the final example you provide, cyl is not recognized because there is no object called cyl in scope. The reason that mpg is recognized is that it refers to the dataset ggplot2::mpg (which is found when ggplot2 is attached), not to the column of mtcars with the same name!
To see what I mean, run the following:
dplyr::n_distinct(mtcars) # 32
dplyr::n_distinct(ggplot2::mpg) # 225
dplyr::n_distinct(mtcars, mpg) # 32
dplyr::n_distinct(mtcars, ggplot2::mpg) # 32
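The name lookup can also be seen without ggplot2. This is only a minimal sketch, assuming a fresh session in which no object named cyl exists; the variable names bare_lookup and column_lookup are made up for illustration. It compares calling n_distinct() on a bare name with evaluating the same call inside the data frame via base R's with():

```r
library(dplyr)

# Outside a data-masking verb, cyl is looked up as an ordinary object.
# Assumption: no object called cyl exists, so the call fails and the
# error is caught here instead of stopping the script.
bare_lookup <- tryCatch(
  n_distinct(cyl),
  error = function(e) "lookup failed"
)

# with() evaluates the expression using mtcars' columns, so cyl now
# resolves to the column vector and n_distinct() can count its values.
column_lookup <- with(mtcars, n_distinct(cyl))

print(bare_lookup)
print(column_lookup)
```

The point is that n_distinct() itself never looks inside mtcars; something else has to turn the column name into a vector first.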
If you want to count the number of unique values in mtcars$cyl and mtcars$mpg, then just use:
dplyr::n_distinct(mtcars$cyl) # 3
dplyr::n_distinct(mtcars$mpg) # 25
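If you would rather stay in a pipeline, one possible alternative (a sketch, not the only way) is to call n_distinct() inside summarise(). There the bare column names are resolved against the data frame, so each call receives a column vector. The output column names n_cyl and n_mpg are arbitrary choices for this example:

```r
library(dplyr)

# Inside summarise(), cyl and mpg refer to columns of mtcars,
# even if another object called mpg happens to be attached.
counts <- mtcars %>%
  summarise(
    n_cyl = n_distinct(cyl),
    n_mpg = n_distinct(mpg)
  )

print(counts)
```

This should give the same 3 and 25 as the two calls above, in a single one-row data frame.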
A tricky one!
Upvotes: 9
Reputation: 4362
Your call to n_distinct(mtcars, mpg) is not returning the correct value, which would be 25. Instead, this line is giving you the number of unique rows in the whole table mtcars, which is 32 according to the output of distinct(mtcars).
What you want to call is n_distinct(mtcars$mpg), which returns 25. Similarly, for the cyl column you would want to say n_distinct(mtcars$cyl) or n_distinct(mtcars[["cyl"]]) (equivalent).
> distinct(mtcars, cyl)
cyl
1 6
2 4
3 8
> n_distinct(mtcars$cyl)
[1] 3
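To make the row-count versus column-count distinction concrete, here is a small sketch that computes both side by side. The variable names are made up for this example, and the base R line is included only as a cross-check:

```r
library(dplyr)

# Distinct rows of the whole table (what the 32 actually measured).
rows_total <- nrow(distinct(mtcars))

# Distinct values of a single column, computed three equivalent ways.
mpg_values <- nrow(distinct(mtcars, mpg))
mpg_base   <- length(unique(mtcars$mpg))
cyl_values <- mtcars %>% pull(cyl) %>% n_distinct()

print(c(rows = rows_total, mpg = mpg_values, cyl = cyl_values))
```

Counting the rows of distinct(mtcars, mpg) is the table-verb way to get the number you were after; n_distinct() is the vector-level shortcut.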
Upvotes: 3