Omry Atia

Reputation: 2443

filter_all with a different condition for each column

I have the following vector

vec1 = c(0.001, 0.05, 0.003, 0.1)

and a data frame

library(dplyr)

# tibble() is the current replacement for the deprecated data_frame()
df = tibble(
  x = seq(0.001, 0.1, length.out = 10),
  y = seq(0.03, 0.07, length.out = 10),
  z = seq(0, 0.005, length.out = 10),
  w = seq(0.05, 0.25, length.out = 10)
)

I would like to filter df so that the output contains only the rows in which every column's value lies within 0.05 of the corresponding entry of vec1, i.e. between vec1 - 0.05 and vec1 + 0.05 for that column.

So in this example, only the first 4 rows satisfy this condition (in x I allow -0.049 to 0.051 based on the first entry of vec1, in y I allow 0 to 0.1 based on the second entry, and so on).

I am sure this can be done with filter_all and the . placeholder, something along the lines of:

filter_all(df, all_vars(. >= (vec1(.) - 0.05) &  . <= (vec1(.) + 0.05))))

But this doesn't work. What am I doing wrong?

Upvotes: 1

Views: 115

Answers (1)

Ronak Shah

Reputation: 388797

We can use mapply to pair each column of df with the corresponding element of vec1, test which values fall inside the allowed range, and then keep only the rows where every column is TRUE. Note that the comparison below is strict, so a value lying exactly on a bound is excluded.

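# x is one column of df, y the matching entry of vec1;
# a row is kept when all ncol(df) comparisons are TRUE.
df[rowSums(mapply(function(x, y) x > (y - 0.05) & x < (y + 0.05),
                  df, vec1)) == ncol(df), ]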

#      x      y        z      w
#   <dbl>  <dbl>    <dbl>  <dbl>
#1 0.0120 0.0344 0.000556 0.0722
#2 0.0230 0.0389 0.00111  0.0944
#3 0.0340 0.0433 0.00167  0.117 
#4 0.0450 0.0478 0.00222  0.139 
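
A minimal sketch of the same mapply idea, split into named steps and using the inclusive bounds (>= and <=) from the question rather than the strict ones above. It swaps the tibble for a plain data.frame so that it runs in base R without dplyr; tol, in_band and keep are hypothetical names introduced here for illustration only.

```r
vec1 <- c(0.001, 0.05, 0.003, 0.1)
tol <- 0.05  # hypothetical name: half-width of the allowed band around vec1

# Plain data.frame stand-in for the tibble in the question (an assumption
# made so the sketch needs no packages).
df <- data.frame(
  x = seq(0.001, 0.1, length.out = 10),
  y = seq(0.03, 0.07, length.out = 10),
  z = seq(0, 0.005, length.out = 10),
  w = seq(0.05, 0.25, length.out = 10)
)

# mapply walks over the columns of df and the entries of vec1 in parallel
# and returns a logical matrix with one column per column of df.
in_band <- mapply(
  function(col, centre) col >= centre - tol & col <= centre + tol,
  df, vec1
)

# A row is kept only when every column is inside its band.
keep <- rowSums(in_band) == ncol(df)
result <- df[keep, ]
print(result)
```

Because 0.1 - 0.05 equals the first value of w, the inclusive version should also keep the first row of df, which the strict comparison drops. Which behaviour is wanted depends on whether the bounds themselves count as allowed.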

Upvotes: 2
