Reputation: 323
Recently I have been wondering whether two filter calls can be combined into one.
For example:
library(tidyverse)
library(nycflights13)
a <- flights %>% filter(month == 1)
b <- flights %>% filter(day == 1)
Now a contains the flights in January, and b the flights on the 1st of any month. I want to combine filters a and b, that is, find the flights on 1 January, without writing a single combined filter such as:
flights %>% filter(month == 1 & day == 1)
Can we compute a and b first and then combine the two filters into one? Sometimes I need to filter on many conditions, and I think it would be easier to write each filter separately first. Thank you!
Upvotes: 0
Views: 791
Reputation: 269694
Is this what you are looking for? Here the dot notation defines a function:
library(dplyr)
a <- . %>% filter(cyl == 4) # or a <- function(x) filter(x, cyl == 4)
b <- . %>% filter(gear == 4)
mtcars %>% a %>% b
giving those rows for which cyl is 4 and gear is 4:
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Merc 230       22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2
Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
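If you want the two filters available as one reusable step, the same dot notation can chain them. This is only a sketch: the name ab is made up for illustration, and it assumes a and b are defined as above.

```r
library(dplyr)

# Reusable single-condition filters, as defined above
a <- . %>% filter(cyl == 4)
b <- . %>% filter(gear == 4)

# Hypothetical name `ab`: a function that applies a, then b
ab <- . %>% a %>% b

combined <- mtcars %>% ab
nrow(combined)
```

The result should have the same rows as mtcars %>% a %>% b, since ab just runs the two steps in order.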
Also note that filtering successively, as we do here, is more efficient than filtering on each condition separately and then combining the results. In the first case each step reduces the size of the data, whereas in the second every filtering step works on the full data.
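Another possible way to write the conditions separately and still run a single filter is to store them as quoted expressions and splice them in. This is a sketch, not the only way to do it: the name conds is hypothetical, and it relies on rlang (installed with dplyr) for quos() and on filter() combining multiple conditions with AND.

```r
library(dplyr)

# Hypothetical list of conditions, each written on its own
conds <- rlang::quos(
  cyl == 4,
  gear == 4
)

# `!!!` splices the list into filter(); the conditions are combined with AND
result <- mtcars %>% filter(!!!conds)
nrow(result)
```

Because the data is filtered only once, this avoids building one intermediate result per condition.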
Upvotes: 2
Reputation: 389047
You can use dplyr::intersect.
a1 <- flights %>% filter(month == 1)
a2 <- flights %>% filter(day == 1)
result1 <- dplyr::intersect(a1, a2)
nrow(result1)
#[1] 842
This is the same as doing:
nrow(flights %>% filter(month == 1 & day == 1))
#[1] 842
If you have more than 2 conditions, use it with Reduce:
a3 <- flights %>% filter(dep_delay == 2)
result2 <- Reduce(dplyr::intersect, list(a1, a2, a3))
nrow(result2)
#[1] 23
which is the same as
nrow(flights %>% filter(month == 1 & day == 1 & dep_delay == 2))
#[1] 23
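One caveat worth checking on your own data: dplyr::intersect is a set operation, so it compares whole rows and drops duplicates. The counts above agree with the combined filter, but on data with repeated rows the two approaches can differ. A small sketch with made-up toy data (df is hypothetical, not part of nycflights13):

```r
library(dplyr)

# Toy data with a repeated row (hypothetical, for illustration only)
df <- data.frame(x = c(1, 1, 2), y = c("a", "a", "b"))

a1 <- df %>% filter(x == 1)
a2 <- df %>% filter(y == "a")

# Set operation: identical rows collapse into one
by_intersect <- dplyr::intersect(a1, a2)

# Combined filter: every matching row is kept
by_filter <- df %>% filter(x == 1 & y == "a")

nrow(by_intersect)
nrow(by_filter)
```

If the row counts differ, the combined filter is the one that keeps every matching row.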
Upvotes: 1