Rafael Zanzoori

Reputation: 591

Using "any" operator in dplyr filter

I'm trying to use the "any" function inside "filter" from the dplyr package, like this:

 library(tidyverse)

 iris %>%
   as_tibble() %>%
   filter( any(Species == "setosa",
               Species == "versicolor") )

# A tibble: 150 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# ... with 140 more rows

For some reason the filter appears to be ignored: all 150 rows of iris are returned.

However, when the "|" operator is used instead, the expected number of rows is returned:

 library(tidyverse)

 iris %>%
   as_tibble() %>%
   filter( Species == "setosa" | 
             Species == "versicolor" )

# A tibble: 100 x 5
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
          <dbl>       <dbl>        <dbl>       <dbl> <fct>  
 1          5.1         3.5          1.4         0.2 setosa 
 2          4.9         3            1.4         0.2 setosa 
 3          4.7         3.2          1.3         0.2 setosa 
 4          4.6         3.1          1.5         0.2 setosa 
 5          5           3.6          1.4         0.2 setosa 
 6          5.4         3.9          1.7         0.4 setosa 
 7          4.6         3.4          1.4         0.3 setosa 
 8          5           3.4          1.5         0.2 setosa 
 9          4.4         2.9          1.4         0.2 setosa 
10          4.9         3.1          1.5         0.1 setosa 
# ... with 90 more rows

Is it possible to make the code work using the "any" operator with dplyr filter?

Rafael

Upvotes: 3

Views: 5694

Answers (2)

Near Lin

Reputation: 109

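You can get what you want by adding rowwise() before filter(). With rowwise(), the expression inside filter() is evaluated once per row, so any() only sees that row's values.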

library(dplyr)

testdf <- data.frame(a = 1:10, b = 10:1)

# Evaluate the filter condition one row at a time
testdf |>
  rowwise() |>
  filter(any(a == 5, b == 9))
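As a rough comparison (a sketch, not part of the original answer): rowwise() evaluates the condition once per row, which may be slow on large data. Assuming the conditions are ordinary vectorised comparisons, as they are here, the element-wise | operator should select the same rows without rowwise(). The names by_row and vectorised are made up for this illustration.

```r
library(dplyr)

testdf <- data.frame(a = 1:10, b = 10:1)

# Row-by-row evaluation: any() sees one value of a and one value of b at a time
by_row <- testdf |>
  rowwise() |>
  filter(any(a == 5, b == 9)) |>
  ungroup()

# Vectorised evaluation: | compares the two logical vectors element by element
vectorised <- testdf |>
  filter(a == 5 | b == 9)

# Both keep the rows (a = 2, b = 9) and (a = 5, b = 6)
nrow(by_row)
nrow(vectorised)
```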

Upvotes: 0

Konrad Rudolph

Reputation: 545865

What purpose does any serve in your code? I think you just want

… %>% filter(Species == "setosa" | Species == "versicolor")

Or

… %>% filter(Species %in% c("setosa", "versicolor"))

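In either case, the expression inside filter returns a logical vector with one element per row of your data frame. By contrast, any returns a single value, either TRUE or FALSE, which filter recycles across all rows, so it will either keep every row or drop every row. In your example at least one row is "setosa", so any returns TRUE and all 150 rows are kept.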
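To see the difference outside of filter, here is a small sketch on the built-in iris data. The variable names (is_setosa, collapsed, elementwise, n_any, n_or, n_in) are chosen just for this illustration.

```r
library(dplyr)

is_setosa     <- iris$Species == "setosa"
is_versicolor <- iris$Species == "versicolor"

# any() reduces all of its arguments to one logical value
collapsed <- any(is_setosa, is_versicolor)
length(collapsed)    # 1

# | works element by element and keeps one value per row
elementwise <- is_setosa | is_versicolor
length(elementwise)  # 150
sum(elementwise)     # 100

# The same effect inside filter(): a single TRUE keeps every row
n_any <- nrow(filter(iris, any(Species == "setosa", Species == "versicolor")))
n_or  <- nrow(filter(iris, Species == "setosa" | Species == "versicolor"))
n_in  <- nrow(filter(iris, Species %in% c("setosa", "versicolor")))
c(n_any, n_or, n_in) # 150 100 100
```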

Upvotes: 3
