Cassandra

Reputation: 137

filter() with multiple sets of conditions

I am trying to filter/subset observations based on whether they meet one of several sets of criteria. For example, an observation with var1=2 and var2=2 should be kept, but an observation with var1=1 and var2=3 should be excluded. An observation should be kept if it matches any one row of criteria.

When I run a cross-tab on my data, the results show that some observations do not meet the criteria and should be excluded.

table(df$var1, df$var2)

     1  2  3
  1  3  5  3
  2  4  5  4
  3  8 16 12
  4 12 15 16
  5 79 83 99

However, when I run my code (below), the number of observations does not change, although it should.

clean <- df %>% filter(
    (var1 == 1 & var2 ==1)       |
    (var1 == 2 & var2 == 1 | 2 ) |    # var1=2 and var2=1 OK,  var1=2 and var2=2 also OK
    (var1 == 3 & var2 == 2)      |
    (var1 == 4 & var2 == 2 | 3)  |
    (var1 == 5 & var2 == 3)
  )

nrow(clean)

364

Upvotes: 0

Views: 28

Answers (1)

David Z

Reputation: 7041

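In R, var2 == 1 | 2 does not mean "var2 is 1 or 2". The bare 2 is coerced to TRUE, so that condition is TRUE for every row and nothing is filtered out, which is why you still get all 364 rows. Writing var1 == 2 & var2 == 1 | var2 == 2 does not fix it either: & binds more tightly than |, so that keeps every row with var2 == 2 regardless of var1. Test var2 against both values with %in% instead:

clean <- df %>% filter(
    (var1 == 1 & var2 == 1)          |
    (var1 == 2 & var2 %in% c(1, 2))  |    # var1=2 with var2=1 or var2=2
    (var1 == 3 & var2 == 2)          |
    (var1 == 4 & var2 %in% c(2, 3))  |    # var1=4 with var2=2 or var2=3
    (var1 == 5 & var2 == 3)
  )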
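
To check this against the numbers in the question, here is a rough sketch that rebuilds a data frame from the posted cross-tab counts (assuming rows are var1 and columns are var2, as table(df$var1, df$var2) prints them). The names counts, grid, always_true, allowed and clean2 are made up for illustration, and the real df will have other columns. It shows that a condition ending in | 2 is TRUE everywhere, that the %in% version keeps 158 rows, and that a semi_join against a small table of allowed pairs gives the same result.

```r
library(dplyr)

# Cell counts copied from the cross-tab in the question:
# rows are var1 = 1..5, columns are var2 = 1..3.
counts <- matrix(
  c( 3,  5,  3,
     4,  5,  4,
     8, 16, 12,
    12, 15, 16,
    79, 83, 99),
  nrow = 5, byrow = TRUE
)

# expand.grid varies var1 fastest, which matches the column-major
# order of as.vector(counts).
grid <- expand.grid(var1 = 1:5, var2 = 1:3)
grid$n <- as.vector(counts)

# One row per observation: repeat each (var1, var2) pair n times.
df <- grid[rep(seq_len(nrow(grid)), grid$n), c("var1", "var2")]
nrow(df)  # 364

# The bare 2 is coerced to TRUE, so this is TRUE for every row.
always_true <- with(df, var1 == 2 & var2 == 1 | 2)
all(always_true)  # TRUE

# Filter with %in% so each row of criteria is tested as intended.
clean <- df %>%
  filter(
    (var1 == 1 & var2 == 1) |
    (var1 == 2 & var2 %in% c(1, 2)) |
    (var1 == 3 & var2 == 2) |
    (var1 == 4 & var2 %in% c(2, 3)) |
    (var1 == 5 & var2 == 3)
  )
nrow(clean)  # 158

# Alternative: list the allowed pairs once and keep only matching rows.
allowed <- data.frame(
  var1 = c(1L, 2L, 2L, 3L, 4L, 4L, 5L),
  var2 = c(1L, 1L, 2L, 2L, 2L, 3L, 3L)
)
clean2 <- semi_join(df, allowed, by = c("var1", "var2"))
nrow(clean2)  # 158
```

The semi_join version may be easier to maintain if the list of allowed combinations grows, since the criteria live in a data frame rather than in a long logical expression.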

Upvotes: 1
