Reputation: 1101
I would like to write a single function that (among other things) filters a data frame on different variables at different thresholds. I found a way to specify the variable to filter on as a function argument, but I'm not sure it is the best way.
How should I go about it?
example_db <- data.frame(name=c("A","B","C"),
value_1=c(1,2,3),
value_2=c(2,3,1))
advanced_filter <- function(data,variable,limit){
require(dplyr)
data <- data %>%
dplyr::filter(variable>limit)
return(data)
}
Expected result:
advanced_filter(example_db,value_1,2)
name value_1 value_2
1 C 3 1
My attempt:
advance_filter <- function(data,variable,limit){
require(dplyr)
f <- paste(variable, ">", limit)
data <- data %>%
dplyr::filter_(f)
return(data)
}
advance_filter(example_db,"value_1",2)
Upvotes: 1
Views: 2881
Reputation: 5788
If you want to be able to pass the column as a string, as a bare symbol, or fully qualified with $, you can use the following base R solution:
# Function:
advanced_filter <- function(data, variable, limit) {
if(is.character(substitute(variable))){
variable_str <- variable
}else{
variable_str <- gsub(".*\\$", "", deparse(substitute(variable)))
}
res <- data[data[,variable_str] > limit,]
return(res)
}
# Application:
advanced_filter(example_db, "value_1", 2)
advanced_filter(example_db, value_1, 2)
advanced_filter(example_db, example_db$value_1, 2)
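To see why the is.character() branch is needed, it can help to inspect what substitute() captures for each of the three call styles. This is only an illustrative sketch: the helper name show_captured is made up here and is not part of the function above.

```r
# Hypothetical helper: report what substitute() sees for an argument.
# The argument is never evaluated, so the column does not need to exist.
show_captured <- function(variable) {
  expr <- substitute(variable)
  list(class = class(expr), text = deparse(expr))
}

show_captured("value_1")$class             # "character": already a string
show_captured(value_1)$class               # "name": a bare symbol
show_captured(example_db$value_1)$class    # "call": the `$` call

# For the `$` form, the gsub() pattern strips everything up to the last `$`
gsub(".*\\$", "", show_captured(example_db$value_1)$text)  # "value_1"
```

One caveat that follows from this: if the column name is stored in another variable (say col <- "value_1") and you pass col, substitute() sees the symbol col rather than the string, so the lookup uses "col" as the column name.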
Upvotes: 1
Reputation: 21937
Allan Cameron's answer is correct and only requires base R; for posterity's sake, here is the tidyverse version.
example_db <- data.frame(name=c("A","B","C"),
value_1=c(1,2,3),
value_2=c(2,3,1))
advanced_filter <- function(data,variable,limit){
require(dplyr)
vbl <- enquo(variable)
data %>%
dplyr::filter(!!vbl > limit)
}
advanced_filter(example_db,value_1,2)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#> name value_1 value_2
#> 1 C 3 1
Created on 2022-01-28 by the reprex package (v2.0.1)
Or, following @TimTeaFan's comment below:
example_db <- data.frame(name=c("A","B","C"),
value_1=c(1,2,3),
value_2=c(2,3,1))
advanced_filter <- function(data,variable,limit){
require(dplyr)
data %>%
dplyr::filter({{variable}} > limit)
}
advanced_filter(example_db,value_1,2)
#> Loading required package: dplyr
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
#> name value_1 value_2
#> 1 C 3 1
Created on 2022-01-28 by the reprex package (v2.0.1)
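Both versions above take the column as a bare name. If the column name arrives as a string instead, as in the question's own attempt, a possible sketch is to subset the .data pronoun that dplyr exports. The function name advanced_filter_chr is made up here to keep it apart from the versions above, and the example assumes dplyr is installed.

```r
library(dplyr)

example_db <- data.frame(name = c("A", "B", "C"),
                         value_1 = c(1, 2, 3),
                         value_2 = c(2, 3, 1))

# Hypothetical string-based variant: `variable` is a length-one character
# vector holding the column name, looked up through the .data pronoun.
advanced_filter_chr <- function(data, variable, limit) {
  data %>%
    dplyr::filter(.data[[variable]] > limit)
}

advanced_filter_chr(example_db, "value_1", 2)
```

Because the name is an ordinary string here, it can also be stored in a variable or built programmatically before the call.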
Upvotes: 3
Reputation: 173803
Perhaps you are making this more complicated than it needs to be:
advanced_filter <- function(data, variable, limit) {
data[data[variable] > limit,]
}
advanced_filter(example_db, "value_1", 2)
#> name value_1 value_2
#> 3 C 3 1
Created on 2022-01-28 by the reprex package (v2.0.1)
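If you want a slightly more defensive take on the same idea, one option is to check the column name up front and drop rows where the comparison is NA. This is only a sketch under those assumptions; the name advanced_filter_checked is invented for illustration.

```r
example_db <- data.frame(name = c("A", "B", "C"),
                         value_1 = c(1, 2, 3),
                         value_2 = c(2, 3, 1))

# Hypothetical variant of the function above with input checks.
advanced_filter_checked <- function(data, variable, limit) {
  # Fail early if `variable` is not a single existing column name
  stopifnot(is.character(variable), length(variable) == 1,
            variable %in% names(data))
  # which() keeps only rows where the comparison is TRUE (NA rows are dropped)
  keep <- which(data[[variable]] > limit)
  data[keep, , drop = FALSE]
}

# The same function covers different variables and thresholds
advanced_filter_checked(example_db, "value_1", 2)
advanced_filter_checked(example_db, "value_2", 1)
```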
Upvotes: 4