spidermarn

Reputation: 939

filter count distinct > 1

Hi, I have a data frame df as below:

ID | Gender
1  | M
1  | F
2  | F
2  | F
2  | F
3  | M
3  | M
3  | F
4  | M
4  | M
4  | M

I'd like to keep only the IDs that have more than one distinct Gender (to flag dirty data, since a person can't have more than one Gender), with one row per distinct ID/Gender pair. The result should be:

ID | Gender
1  | M
1  | F
3  | M
3  | F

How can I go about this in R using dplyr?

Upvotes: 2

Views: 241

Answers (1)

Sotos

Reputation: 51592

Using dplyr,

library(dplyr)

df %>% 
  group_by(ID) %>% 
  filter(n_distinct(Gender) > 1) %>% 
  distinct(Gender)

which gives,

# A tibble: 4 x 2
# Groups:   ID [2]
  Gender    ID
  <chr>  <int>
1 M          1
2 F          1
3 M          3
4 F          3
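
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the table in the question, and the trailing ungroup() is my own addition (not part of the pipeline above) so that the result is no longer grouped by ID. The column order in the printed result may differ from the output shown above depending on your dplyr version.

```r
library(dplyr)

# Rebuild the example data from the question (values typed in by hand)
df <- data.frame(
  ID = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L),
  Gender = c("M", "F", "F", "F", "F", "M", "M", "F", "M", "M", "M"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  filter(n_distinct(Gender) > 1) %>%  # keep only IDs with more than one Gender
  distinct(Gender) %>%                # one row per ID/Gender pair within each ID
  ungroup()                           # assumption: drop grouping for later steps

result
```

If you only need the offending IDs rather than the rows, replacing distinct(Gender) with distinct(ID) should give one row per ID instead.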

Upvotes: 3
