Reputation: 847
I am trying to write a function that allows a user to provide an unquoted argument to a function that will filter a dataset.
For simplicity I have used the iris dataset and mocked up a function.
library(dplyr)

filter_function <- function(var_to_filter_on){
  test_data <- iris %>%
    filter(Species %in% {{var_to_filter_on}})
  print(test_data)
}
This however returns:
Error in `filter()`:
ℹ In argument: `Species %in% virginica`.
Caused by error:
! object 'virginica' not found
I believe this is an issue with "non-standard evaluation", but my understanding of it is not great. I thought that the {{}} operator would allow me to pass "virginica" without having to quote it in the function call?
I'm also using %in% because I want the user to have the flexibility to supply a vector of names to filter on.
How do I get this function to work?
Cheers
Upvotes: 0
Views: 102
Reputation: 269694
As others have mentioned, unquoted names in R (as in subset and transform) and in the tidyverse normally refer to column names or to components of a data frame or list, not to string constants. It would be less confusing for the caller and easier to implement if we just passed a quoted string or a character vector of them. That also makes it very easy to pass several strings at once as a character vector.
One additional idea that gives most of the advantages of unquoted strings without the atypical usage is to allow the argument to be passed either as a quoted string or character vector, or as a formula. A formula does not require quoting, and base R supplies the handy all.vars function, which extracts all the names from a formula as character strings.
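As a quick check of what all.vars returns for the kind of formula meant here, a minimal sketch using base R only (the object names one and two are just for illustration):

```r
# all.vars() returns the variable names found in a formula as a character vector
one <- all.vars(~ setosa)
two <- all.vars(~ setosa + virginica)

print(one)
print(two)
```

The names are never evaluated, so it does not matter that no objects called setosa or virginica exist.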
This design also has the advantage that it is easy to use when programming. Suppose we did not know ahead of time which character string was to be passed and instead it was contained in a character vector variable. Then it would be possible to just pass that variable in an ordinary way as shown below.
With this approach any of the 5 calls below will work.
library(dplyr)

# x is variable(s) to filter iris on as a char vec or formula
filter_function <- function(x) {
  if (inherits(x, "formula")) x <- all.vars(x)
  out <- iris %>% filter(Species %in% x)
  print(out)
}

# any of these work:
filter_function(~ setosa)
filter_function(~ setosa + virginica)
filter_function("setosa")
filter_function(c("setosa", "virginica"))
species <- c("setosa", "virginica")
filter_function(species)
Upvotes: 4
Reputation: 1253
The {{ operator is meant to pass unquoted variable names to a function that uses tidy evaluation:
The embrace operator {{ is used to create functions that call other data-masking functions. It transports a data-masked argument (an argument that can refer to columns of a data frame) from one function to another.
When you call filter_function(virginica), the symbol virginica (not the literal string "virginica") is injected into the expression Species %in% {{var_to_filter_on}}, so you are effectively calling:
test_data <- iris %>%
  filter(Species %in% virginica)
as you can see from the error message. Since virginica is neither a variable in your environment nor a column in iris, you get an error.
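For contrast, here is a minimal sketch of the situation {{ is designed for, where the unquoted argument names a column rather than a value. The helper name filter_by_column and its values argument are hypothetical, made up for this illustration:

```r
library(dplyr)

# Hypothetical helper: `col` is an unquoted column of iris,
# `values` is an ordinary character vector of values to keep
filter_by_column <- function(col, values) {
  iris %>% filter({{ col }} %in% values)
}

res <- filter_by_column(Species, c("setosa", "virginica"))
print(head(res))
```

Here Species is found in the data mask, so the embraced argument resolves to the column, while the values to match stay as quoted strings.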
If you want to allow the user to provide the species epithet as a symbol rather than a string, you'd need to capture the argument and then convert it to a string:
filter_function <- function(var_to_filter_on){
  test_data <- iris %>%
    filter(Species %in% rlang::as_string(rlang::enexpr(var_to_filter_on)))
  print(test_data)
}
If you want to be able to supply multiple values using c(), then you'd have to extract the arguments from the call object using as.list(rlang::enexpr(var_to_filter_on))[-1], or pass them in as dotted arguments (...) and use rlang::enexprs().
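A rough sketch of both options follows. The function names filter_function_c and filter_function_dots are hypothetical, and the capture is done before calling filter() to keep the steps easy to follow:

```r
library(dplyr)

# Variant 1 (hypothetical name): accepts a bare symbol or a c(...) call of symbols
filter_function_c <- function(var_to_filter_on) {
  expr <- rlang::enexpr(var_to_filter_on)
  # For a call such as c(setosa, virginica), drop the function name and keep the arguments;
  # for a single symbol, wrap it in a list so both cases are handled the same way
  parts <- if (is.call(expr)) as.list(expr)[-1] else list(expr)
  wanted <- vapply(parts, rlang::as_string, character(1), USE.NAMES = FALSE)
  iris %>% filter(Species %in% wanted)
}

# Variant 2 (hypothetical name): accepts any number of bare symbols through ...
filter_function_dots <- function(...) {
  wanted <- vapply(rlang::enexprs(...), rlang::as_string, character(1), USE.NAMES = FALSE)
  iris %>% filter(Species %in% wanted)
}

res_single <- filter_function_c(virginica)
res_c      <- filter_function_c(c(setosa, virginica))
res_dots   <- filter_function_dots(setosa, virginica)
```

With either variant a variable holding the species names can no longer be passed in, because its name would be taken literally as the value to match.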
However, I think it makes more sense to follow @Edward's suggestion and simply pass in a string. Unquoted symbols refer to variables, whereas the species epithet is data, which is represented by quoted strings. Mixing the two creates a non-idiomatic interface for your users.
Upvotes: 1