Reputation: 37
I want to deduplicate the rows of my data frame based on two conditions combined with OR, not AND.
But when I ran df %>% group_by(sex) %>% distinct(state, education) %>% summarise(n=n())
the rows were deduplicated on the combination of both columns (AND), not on either one (OR).
Is there a way to write something like df %>% group_by(sex) %>% distinct(state | education) %>% summarise(n=n())
so that the deduplication is based on state OR education instead?
Thank you.
Upvotes: 0
Views: 54
Reputation: 11981
You can use tidyr::pivot_longer and then distinct afterwards:
library(dplyr)
library(tidyr)

df %>%
  # stack state and education into a single "value" column
  pivot_longer(c(state, education), names_to = "type", values_to = "value") %>%
  group_by(sex) %>%
  distinct(value) %>%
  summarise(n = n())
In this case, pivot_longer simply puts state and education into one column called value.
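To see what this pipeline actually counts, here is a minimal sketch on a made-up data frame. The toy values are an assumption for illustration only, not the asker's data, and both columns are assumed to be character so they can share one value column. Under that assumption, the result is the number of distinct values found in either state or education within each sex group.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from the question)
df <- data.frame(
  sex       = c("F", "F", "F", "M", "M"),
  state     = c("NY", "NY", "CA", "TX", "TX"),
  education = c("BA", "MS", "BA", "BA", "BA")
)

result <- df %>%
  # stack state and education into one "value" column
  pivot_longer(c(state, education), names_to = "type", values_to = "value") %>%
  group_by(sex) %>%
  # keep each value once per sex, whichever column it came from
  distinct(value) %>%
  summarise(n = n())

print(result)
# F has the values NY, BA, MS, CA -> n = 4
# M has the values TX, BA         -> n = 2
```

Note that the same string appearing in both columns would be counted only once. If that is not wanted, distinct(type, value) would keep the two columns apart.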
Upvotes: 1