Reputation: 1009
I have a list of custom filters that I need to subset my dataframe by. For example, for mtcars I have this list:
filters=c(mpg>15, wt<2, carb>2 & am==0)
I want to pass this list to a function that uses dplyr and pipes:
fmean <- function(filter_x) mtcars %>% filter(filter_x) %>% summarise(mean(disp))
My expected output after passing the list to the fmean is:
subset mean(disp)
mpg>15 192
wt<2 80.2
carb>2 & am==0 324
How can I obtain the above output?
EDIT: I found a tidyverse solution, thanks to @alistaire and the others who replied here:
library(tidyverse)
filters <- c("mpg > 15", "wt < 2", "carb > 2 & am==0")
fmean <- function(filter_x) {
  mtcars %>%
    filter_(filter_x) %>%
    summarise(mean(disp)) %>%
    mutate(subset = filter_x) %>%
    select(subset, everything())
}
filters %>% map_df(fmean)
output:
subset mean(disp)
mpg > 15 192.3
wt < 2 80.2
carb > 2 & am==0 324.5
Upvotes: 3
Views: 1827
Reputation: 19544
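Using base R's lapply and do.call(rbind, ...) to combine the results (the filtering itself still relies on dplyr's filter_ and the pipe):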
fmean <- function(filter_x) data.frame(
  subset = filter_x,
  do.call(rbind,
    lapply(filter_x, function(x)
      mtcars %>% filter_(x) %>% summarise(mean(disp)))))
fmean(filters)
subset mean.disp.
1 mpg > 15 192.3115
2 wt < 2 80.2250
3 carb > 2 & am==0 324.4600
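For a version with no package dependencies, here is a minimal sketch that assumes the conditions can be stored as unevaluated expressions via expression() rather than as strings, which is closer to the unquoted form in the question. The names fmean_expr and mean_disp are hypothetical and do not come from the answers in this thread.

```r
# Unquoted conditions, stored without being evaluated
filters <- expression(mpg > 15, wt < 2, carb > 2 & am == 0)

# fmean_expr is a hypothetical helper name used only for this sketch
fmean_expr <- function(filter_x) {
  conds <- as.list(filter_x)
  means <- vapply(conds, function(cond) {
    # Evaluate the condition with the columns of mtcars in scope,
    # giving one TRUE/FALSE per row
    keep <- eval(cond, mtcars)
    mean(mtcars$disp[keep])
  }, numeric(1))
  data.frame(
    # deparse() turns each condition back into text for labelling
    subset = vapply(conds, deparse, character(1)),
    mean_disp = means
  )
}

result <- fmean_expr(filters)
result
```

Because the conditions are never parsed from text, this avoids eval(parse()) altogether, at the cost of requiring the filters to be written in code rather than read from strings.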
Upvotes: 1
Reputation: 6372
A data.table approach, using eval(parse()):
library(data.table)
mt_dt <- data.table(mtcars)
filters <- c("mpg > 15", "wt < 2", "carb > 2 & am==0")
out <- sapply(filters, function(x){mt_dt[eval(parse(text = x)), mean(disp)]})
out
# mpg > 15 wt < 2 carb > 2 & am==0
# 192.3115 80.2250 324.4600
We loop over the filters and, for each one, subset the table and apply the aggregation function.
This results in a named vector, which is quite flexible to work with. If you prefer a table, you can use:
data.table(subset = names(out), `mean(disp)` = out)
# subset mean(disp)
# 1: mpg > 15 192.3115
# 2: wt < 2 80.2250
# 3: carb > 2 & am==0 324.4600
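To see what eval(parse()) does for a single filter, the steps can be sketched in plain R without data.table. This is only an illustration of the mechanism; the variable names cond and keep are made up for the example.

```r
x <- "wt < 2"

# parse() returns an expression vector; [[1]] extracts the single call inside it
cond <- parse(text = x)[[1]]
class(cond)              # "call"

# Evaluating the call with mtcars as the environment resolves `wt` to the column,
# which yields one logical value per row
keep <- eval(cond, mtcars)
sum(keep)                # 4 rows satisfy wt < 2

# The logical vector then selects the rows to aggregate
mean(mtcars$disp[keep])  # 80.225
```

Note that eval(parse()) will run whatever code the string contains, so it is best reserved for filter strings you have written yourself.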
Upvotes: 1
Reputation: 606
The most straightforward way to accomplish this is probably to use the purrr package which, along with dplyr, is part of the tidyverse package:
library(tidyverse)
filters <- c("mpg > 15", "wt < 2", "carb > 2 & am==0")
fmean <- function(filter_x) {
  # Create list of means
  means <- filter_x %>%
    map(~ mtcars %>% filter_(.dots = .x) %>% summarise(mean(disp)))
  # Create tibble from means
  tibble(subset = filter_x, means = unlist(means))
}
fmean(filters)
Additionally, you want to use filter_ instead of filter: filter_ lets you pass the subsetting conditions as strings rather than as unquoted expressions.
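The underscore verbs such as filter_ are deprecated in later dplyr releases. A possible equivalent, sketched here assuming dplyr 1.0.0 or later with rlang installed, parses each string with rlang::parse_expr() and unquotes it with !! inside filter(). The names fmean_tidy and mean_disp are hypothetical.

```r
library(dplyr)
library(purrr)

filters <- c("mpg > 15", "wt < 2", "carb > 2 & am==0")

# fmean_tidy is a hypothetical name; it handles one filter string at a time
fmean_tidy <- function(filter_x) {
  # parse_expr() turns the string into an expression;
  # !! unquotes it so filter() treats it as if it had been typed directly
  cond <- rlang::parse_expr(filter_x)
  mtcars %>%
    filter(!!cond) %>%
    summarise(mean_disp = mean(disp)) %>%
    mutate(subset = filter_x, .before = 1)
}

# Apply the helper to every filter and stack the one-row results
result <- filters %>% map(fmean_tidy) %>% bind_rows()
result
```

The result has one row per filter, with the filter string in the subset column and the mean of disp next to it.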
Upvotes: 4