user16597745

Reputation: 37

Unique rows based on two logical conditions

I want my dataframe to return unique rows based on two logical conditions (OR not AND).

But when I ran df %>% group_by(sex) %>% distinct(state, education) %>% summarise(n=n()), the rows were deduplicated on the combination of the two columns (AND), not on either column (OR).

Is there a way to write something like df %>% group_by(sex) %>% distinct(state | education) %>% summarise(n=n()), so that the two conditions are combined with OR instead of AND?

Thank you.

Upvotes: 0

Views: 54

Answers (1)

Cettt

Reputation: 11981

You can use tidyr::pivot_longer and then distinct afterwards:

library(dplyr)
library(tidyr)

df %>%
  pivot_longer(c(state, education), names_to = "type", values_to = "value") %>%
  group_by(sex) %>%
  distinct(value) %>%
  summarise(n = n())

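In this case, pivot_longer simply stacks state and education into one column called value (the original column name goes into type), so distinct then deduplicates across both columns at once within each sex group.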
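
To make the behaviour concrete, here is a minimal sketch on a made-up data frame. The column names follow the question, but the values are hypothetical and chosen only for illustration. It assumes state and education have the same type (here character), since pivot_longer combines them into a single column, and that counting distinct values across both columns is what is wanted.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: column names are from the question, values are invented.
df <- tibble(
  sex       = c("F", "F", "F", "M", "M"),
  state     = c("NY", "NY", "CA", "TX", "TX"),
  education = c("BA", "MA", "BA", "BA", "BA")
)

result <- df %>%
  # Stack state and education into one column, two rows per original row.
  pivot_longer(c(state, education), names_to = "type", values_to = "value") %>%
  group_by(sex) %>%
  # On a grouped data frame, distinct() keeps the grouping column,
  # so this deduplicates value within each sex.
  distinct(value) %>%
  # Count the distinct values remaining per sex.
  summarise(n = n())

print(result)
```

For this toy data, F has the distinct values NY, BA, MA and CA (n = 4) and M has TX and BA (n = 2). Note that a state and an education entry with identical text would be treated as the same value; if that matters, distinct(type, value) would keep them apart.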

Upvotes: 1
