hsl

Reputation: 681

How to filter dataframe with multiple conditions?

I have this dataframe that I'd like to subset (if possible, with dplyr or base R functions):

df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

x  y
1 30
1 10
1  8
2 10
2 18
2  5

Assuming x is a factor (so two conditions/levels), how can I subset/filter this dataframe so that I keep only the rows where df$y is greater than 15 when df$x == 1, and the rows where df$y is greater than 5 when df$x == 2?

This is what I'd like to get:

df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))

x  y
1 30
2 10
2 18

Appreciate any help! Thanks!

Upvotes: 5

Views: 12153

Answers (2)

akrun

Reputation: 886938

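If you have several 'x' groups, one option is to use mapply. We split 'y' using 'x' as the grouping variable, create a vector of thresholds in the same order as the groups (c(15,5)), and use mapply to compare each group with its threshold. The result is a logical index for subsetting 'df'. Note that this relies on the rows of 'df' already being ordered by 'x', because split() returns the groups in level order.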

df[unlist(mapply('>', split(df$y, df$x), c(15,5))),]
#  x  y
#1 1 30
#4 2 10
#5 2 18
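
If the rows are not sorted by 'x', a lookup by group name may be safer. The following is only a sketch of that idea, not part of the original answer; the names `thresholds` and `keep` are made up for illustration, and it assumes every value of 'x' has an entry in `thresholds`.

```r
# Example data from the question
df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

# Hypothetical named vector: one threshold per level of x
thresholds <- c("1" = 15, "2" = 5)

# Look up each row's threshold by the name of its x value,
# so the comparison does not depend on row order
keep <- unname(df$y > thresholds[as.character(df$x)])

result <- df[keep, ]
result
```

Because each row is matched to its threshold by name, the same code should also work when the groups are interleaved.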

Upvotes: 2

Mamoun Benghezal

Reputation: 5314

You can try this:

with(df, df[ (x==1 & y>15) | (x==2 & y>5), ])
  x  y
1 1 30
4 2 10
5 2 18

Or with dplyr:

library(dplyr)
filter(df, (x==1 & y>15) | (x==2 & y>5))
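
With more than a couple of groups, writing one condition per group gets long. One possible alternative, sketched here under the assumption that the thresholds can be kept in a small lookup table, is to join that table and filter once. The names `limits` and `min_y` are hypothetical and not from the question.

```r
library(dplyr)

# Example data from the question
df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

# Hypothetical lookup table: one minimum y per value of x
limits <- data.frame(x = c(1, 2), min_y = c(15, 5))

result <- df %>%
  inner_join(limits, by = "x") %>%  # attach each row's threshold
  filter(y > min_y) %>%             # keep rows above their own threshold
  select(-min_y)                    # drop the helper column

result
```

Adding another group then only means adding a row to `limits`. Rows whose 'x' has no entry in `limits` are dropped by the inner join.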

Upvotes: 1
