abalter

Reputation: 10393

Apply a vector of filters based on a string (or vector of strings) in dplyr

R and the tidyverse have some extremely powerful but equally mysterious methods for turning strings into actionable expressions. I feel like one needs to be an expert to really understand how to use them.

NOTE: this question differs from this one in that I specifically ask about a vector of (that is, multiple) filter conditions. I demonstrate a solution for a single filter, which fails when I try several ways of extending it to multiple filters.

I want to do something along the lines of:

library(dplyr)

df <- data.frame(A=1:10, B=1:10)
df %>% filter(A<3, B<5)

But where the filters are contained in either a string such as "A<3, B<5" or a character vector such as c("A<3", "B<5").

I can do

df %>% filter(eval(str2expression("A<3")))
#   A B
# 1 1 1
# 2 2 2

But this does not work:

df %>% filter(eval(str2expression("A<3, B<5")))
Error in str2expression("A<3, B<5") : <text>:1:4: unexpected ','
1: A<3,
       ^

These don't work either:

> df %>% filter(!!c(str2expression("A<3"), str2expression("B<5")))
Error: Argument 2 filter condition does not evaluate to a logical vector
> df %>% filter(!!!c(str2expression("A<3"), str2expression("B<5")))
Error: Can't splice an object of type `expression` because it is not a vector
Run `rlang::last_error()` to see where the error occurred.

Evaluating a vector of expressions from str2expression for some reason only applies the last expression:

> df %>% filter(eval(c(str2expression("A<3"), str2expression("B<5"))))
#   A B
# 1 1 1
# 2 2 2
# 3 3 3
# 4 4 4

Using a vector of evaluated expressions fails altogether:

> df %>% filter(!!!c(eval(str2expression("A<3")), eval(str2expression("B<5"))))
Error in eval(str2expression("A<3")) : object 'A' not found

I can do:

> df %>% filter(!!!c(expr(A<3), expr(B<5)))
#   A B
# 1 1 1
# 2 2 2

This tells me that expr(A<3) is NOT the same thing as str2expression("A<3").

But that isn't starting from strings.

What to do?

Upvotes: 4

Answers (2)

abalter

Reputation: 10393

Learning from @Ronak Shah's answer, it appears that in dplyr I can combine multiple conditions in filter with a single & instead of a comma. I don't fully understand this, because & does not behave the same way as the logical && operator:

> df %>% filter(A<3 & B<5)
  A B
1 1 1
2 2 2
> df %>% filter(A<3 && B<5)
    A  B
1   1  1
2   2  2
3   3  3
4   4  4
5   5  5
6   6  6
7   7  7
8   8  8
9   9  9
10 10 10

Nevertheless, the following does work:

> df %>% filter(eval(str2expression("A<3 & B<5")))
  A B
1 1 1
2 2 2
> df %>% filter(eval(str2expression("A<6 & B<5")))
  A B
1 1 1
2 2 2
3 3 3
4 4 4
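
One way to see why the two operators behave differently is to try them outside dplyr. The sketch below uses only base R, and the vectors x and y are made up for illustration. `&` compares element by element and returns one value per row, which is what filter() needs, while `&&` is meant for single TRUE/FALSE values. In older versions of R, `&&` silently used only the first element of each side, which would explain the output above: a single TRUE recycled to keep every row. As of R 4.3.0 that call is an error instead.

```r
# Elementwise AND: one result per pair of elements
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)
elementwise <- x & y
print(elementwise)

# Scalar AND: intended for single values, e.g. inside if()
scalar <- x[1] && y[1]
print(scalar)

# With the data from the question, `&` yields one logical per row,
# which can be used directly as a row mask
df <- data.frame(A = 1:10, B = 1:10)
mask <- df$A < 3 & df$B < 5
print(df[mask, ])
```

So inside filter(), `A<3 & B<5` should give the same rows as `A<3, B<5`, because filter() combines comma-separated conditions with an elementwise AND.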

Upvotes: 1

Ronak Shah

Reputation: 389175

You could use parse_exprs from rlang, which parses each string into an expression and returns them as a list that !!! can splice:

library(dplyr)
expr <- c("A<3", "B<5")

filter(df, !!!rlang::parse_exprs(expr))

#  A B
#1 1 1
#2 2 2
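
As a rough illustration of why this works where the str2expression attempts in the question did not: !!! needs a list (or vector) to splice, and str2expression returns an `expression` object rather than a plain list. The sketch below assumes the same df as in the question; the names conds and parsed_base are made up here. Converting the expression object with as.list() should give !!! something it can splice.

```r
library(dplyr)

df <- data.frame(A = 1:10, B = 1:10)
conds <- c("A<3", "B<5")

# str2expression() accepts a character vector and returns an
# expression vector with one element per string
parsed_base <- as.list(str2expression(conds))

# Each list element is now a call such as A < 3, so !!! can
# splice them into filter() as separate conditions
res <- filter(df, !!!parsed_base)
print(res)
```

This is a base R route to the same kind of input that rlang::parse_exprs produces; parse_exprs is probably the tidier choice if rlang is already in use.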

Or you could combine the two expressions into a single string joined by & and then evaluate it with eval:

filter(df, eval(parse(text = paste0(expr, collapse = "&"))))

#  A B
#1 1 1
#2 2 2
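
For the other case in the question, a single string such as "A<3, B<5", one possible approach is to let R's own parser deal with the commas by wrapping the string in a dummy function call and taking the call's arguments. This is only a sketch: cond_string and conds are made-up names, and `list` is used purely as a placeholder function name that is never evaluated. It requires str2lang, which is available in R 3.6.0 and later.

```r
library(dplyr)

df <- data.frame(A = 1:10, B = 1:10)
cond_string <- "A<3, B<5"

# Build the text "list(A<3, B<5)" and parse it into a single call
wrapped <- str2lang(paste0("list(", cond_string, ")"))

# A call converts to a list of (function name, arg1, arg2, ...);
# dropping the first element leaves just the conditions
conds <- as.list(wrapped)[-1]

res <- filter(df, !!!conds)
print(res)
```

Compared with splitting the string on commas, this should also cope with conditions that contain commas themselves, such as `A %in% c(1, 2)`. As with any approach that parses strings into code, it is best kept to strings you trust.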

Upvotes: 4
