Reputation: 27
I have two requirements.
I've been learning R for the last two weeks, watching YouTube videos and referring to Stack Overflow and other websites, so I don't know much yet. Please do point me to any material or courses.
I found the answer to my first question here (Find duplicated elements with dplyr):
library(dplyr)

# All duplicated elements
mtcars %>%
  filter(carb %in% unique(.[["carb"]][duplicated(.[["carb"]])]))
So I want the opposite of this: only the rows whose value is not duplicated.
Thanks
P.S. I have a non-technical background. I went through a couple of questions and answers here, so I might have come across the answer, or one that needed a few tweaks, and totally missed it.
Upvotes: 0
Views: 214
Reputation: 546153
As you probably realised, unique and duplicated don't quite do what you need, because they essentially retain all distinct values and just collapse “multiple copies” of such values.
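To see the behaviour concretely, here is a minimal illustration on a plain vector (toy data, not from your question):

x <- c(1, 2, 2, 3)
unique(x)      # 1 2 3 - every distinct value is kept once
duplicated(x)  # FALSE FALSE TRUE FALSE - only the later copies are flagged, never the first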
For your first question, you can group_by the column that you're interested in, and then retain just those groups (via filter) which have more than one row:
mtcars %>%
  group_by(mpg) %>%             # one group per distinct mpg value
  filter(length(mpg) > 1) %>%   # keep only groups with more than one row
  ungroup()
This example selects all rows for which the mpg value is duplicated. This works because, when applied to groups, dplyr operations such as filter work on each group individually. This means that length(mpg) in the above code will return the length of the mpg column vector of each group, separately.
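Incidentally, the same group-size test is often written with dplyr's n() helper, which returns the number of rows in the current group; assuming a reasonably recent dplyr, this sketch is equivalent:

mtcars %>%
  group_by(mpg) %>%
  filter(n() > 1) %>%   # n() is the size of the current group
  ungroup()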
To invert the logic, it’s enough to invert the filtering condition:
mtcars %>%
  group_by(mpg) %>%
  filter(length(mpg) == 1) %>%
  ungroup()
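As a quick sanity check, here is a sketch on a toy tibble (the data frame df and column x are made up for illustration); the two filters split the rows between them:

library(dplyr)

df <- tibble(x = c(1, 2, 2, 3))

df %>% group_by(x) %>% filter(length(x) > 1) %>% ungroup()   # both rows with x == 2
df %>% group_by(x) %>% filter(length(x) == 1) %>% ungroup()  # the rows with x == 1 and x == 3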
Upvotes: 2