MCS

Reputation: 1101

Use a variable name as function argument

I would like to write a single function that (among other things) filters a data frame on different variables at different thresholds. I found a way to specify the variable I want to filter on as a function argument, but I am not sure it is the best way.

How should I go about it?

example_db <- data.frame(name=c("A","B","C"), 
                         value_1=c(1,2,3), 
                         value_2=c(2,3,1))

advanced_filter <- function(data,variable,limit){
  
  require(dplyr)
  data <- data %>% 
    dplyr::filter(variable>limit) 
  
  return(data)
}

Expected result:

advanced_filter(example_db,value_1,2)

name value_1 value_2
1    C       3       1

My attempt:

advance_filter <- function(data,variable,limit){
  require(dplyr)
  f <- paste(variable, ">",  limit)
  data <- data %>% 
    dplyr::filter_(f) 
  
  return(data)
}


advance_filter(example_db,"value_1",2)

Upvotes: 1

Views: 2881

Answers (3)

hello_friend

Reputation: 5788

If you want to be able to pass the column as a string, as a bare symbol, or fully qualified (example_db$value_1), you can use the following base R solution:

# Function:
advanced_filter <- function(data, variable, limit) {
  if(is.character(substitute(variable))){
    variable_str <- variable
  }else{
    variable_str <- gsub(".*\\$", "", deparse(substitute(variable)))
  }
  res <- data[data[,variable_str] > limit,]
  return(res)
}
# Application:
advanced_filter(example_db, "value_1", 2)
advanced_filter(example_db, value_1, 2)
advanced_filter(example_db, example_db$value_1, 2)
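
To see why the function above accepts all three call styles, it may help to look at what substitute() and deparse() produce for each one. This is only a minimal sketch; show_arg is a hypothetical helper introduced for illustration and mirrors the name-extraction step of advanced_filter:

```r
# show_arg is a hypothetical helper, not part of the answer above.
# It returns the column name that the name-extraction step would produce.
show_arg <- function(variable) {
  expr <- substitute(variable)   # capture the argument without evaluating it
  if (is.character(expr)) {
    expr                         # a string literal is used as-is
  } else {
    # deparse() turns the expression into text; gsub() drops any "df$" prefix
    gsub(".*\\$", "", deparse(expr))
  }
}

show_arg("value_1")
show_arg(value_1)
show_arg(example_db$value_1)
```

All three calls resolve to the same column name. One limitation of this approach: if the name is stored in another variable (col <- "value_1"; show_arg(col)), the result is "col", because the argument is inspected rather than evaluated.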

Upvotes: 1

DaveArmstrong

Reputation: 21937

Allan Cameron's answer is obviously correct and only requires base R, but for posterity's sake, here is the tidyverse version.

example_db <- data.frame(name=c("A","B","C"), 
                         value_1=c(1,2,3), 
                         value_2=c(2,3,1))



advanced_filter <- function(data,variable,limit){
  require(dplyr)
  vbl <- enquo(variable)
  data %>% 
    dplyr::filter(!!vbl > limit) 
}

advanced_filter(example_db,value_1,2)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#>   name value_1 value_2
#> 1    C       3       1

Created on 2022-01-28 by the reprex package (v2.0.1)

Or, following @TimTeaFan's comment below:

example_db <- data.frame(name=c("A","B","C"), 
                         value_1=c(1,2,3), 
                         value_2=c(2,3,1))



advanced_filter <- function(data,variable,limit){
  require(dplyr)
  data %>% 
    dplyr::filter({{variable}} > limit) 
}

advanced_filter(example_db,value_1,2)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
#>   name value_1 value_2
#> 1    C       3       1

Created on 2022-01-28 by the reprex package (v2.0.1)
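
Both versions above take the column as a bare name. If the column name arrives as a string instead, as in the question's own attempt, one option is the .data pronoun that dplyr makes available inside filter(). A minimal sketch, assuming a reasonably recent dplyr; advanced_filter_chr is a hypothetical name used only for illustration:

```r
library(dplyr)

example_db <- data.frame(name = c("A", "B", "C"),
                         value_1 = c(1, 2, 3),
                         value_2 = c(2, 3, 1))

# advanced_filter_chr is a hypothetical name used for illustration only.
# `variable` is expected to be a single column name given as a string.
advanced_filter_chr <- function(data, variable, limit) {
  data %>%
    dplyr::filter(.data[[variable]] > limit)
}

advanced_filter_chr(example_db, "value_1", 2)
```

This keeps the string interface of the original attempt without building the filter expression with paste().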

Upvotes: 3

Allan Cameron

Reputation: 173803

Perhaps you are making this more complicated than it needs to be:

advanced_filter <- function(data, variable, limit) {
  data[data[variable] > limit,]
}

advanced_filter(example_db, "value_1", 2)
#>   name value_1 value_2
#> 3    C       3       1

Created on 2022-01-28 by the reprex package (v2.0.1)

Upvotes: 4
