Reputation: 4889
I want to write a custom function that can take both bare and "string" inputs, and that can handle functions both with and without the formula interface.
# setup
set.seed(123)
library(tidyverse)
# custom function
foo <- function(data, x, y) {
  # function without formula
  print(table(data %>% dplyr::pull({{ x }}), data %>% dplyr::pull({{ y }})))
  # function with formula
  print(
    broom::tidy(stats::t.test(
      formula = rlang::new_formula({{ rlang::ensym(y) }}, {{ rlang::ensym(x) }}),
      data = data
    ))
  )
}
Bare names work for both functions, with and without the formula interface:
foo(mtcars, am, cyl)
#>
#>      4  6  8
#>   0  3  4 12
#>   1  8  3  2
#> # A tibble: 1 x 10
#>   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
#>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
#> 1     1.87      6.95      5.08      3.35 0.00246      25.9    0.724      3.02
#> # ... with 2 more variables: method <chr>, alternative <chr>
Strings also work for both functions, with and without the formula interface:
foo(mtcars, "am", "cyl")
#>
#>      4  6  8
#>   0  3  4 12
#>   1  8  3  2
#> # A tibble: 1 x 10
#>   estimate estimate1 estimate2 statistic p.value parameter conf.low conf.high
#>      <dbl>     <dbl>     <dbl>     <dbl>   <dbl>     <dbl>    <dbl>     <dbl>
#> 1     1.87      6.95      5.08      3.35 0.00246      25.9    0.724      3.02
#> # ... with 2 more variables: method <chr>, alternative <chr>
Expressions that evaluate to strings work only for the function without the formula interface:
foo(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])
#>
#>      4  6  8
#>   0  3  4 12
#>   1  8  3  2
#> Error: Only strings can be converted to symbols
#> Backtrace:
#> x
#> 1. \-global::foo(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])
#> 2. +-base::print(...)
#> 3. +-broom::tidy(...)
#> 4. +-stats::t.test(...)
#> 5. +-rlang::new_formula(...)
#> 6. \-rlang::ensym(y)
How can I modify the original function so that it will work with all the above-mentioned ways of entering the inputs and for both kinds of functions used?
Upvotes: 0
Views: 331
Reputation: 13721
I have to agree with @MrFlick and others about inherent ambiguity when mixing standard and non-standard evaluation. (I also pointed this out in your similar question from a while ago.)
However, one can argue that dplyr::select() works with symbols, strings and expressions of the form colnames(.)[.]. If you absolutely must have the same interface, then you can leverage tidyselect to resolve your inputs:
library( rlang )
library( tidyselect )
library( magrittr )   ## provides %>% (already attached if tidyverse is loaded)
ttest <- function(data, x, y) {
  ## Identify locations of x and y in data, get column names as symbols
  s <- eval_select( expr(c({{x}},{{y}})), data ) %>% names %>% syms
  ## Use the corresponding symbols to build the formula by hand
  broom::tidy(stats::t.test(
    formula = new_formula( s[[2]], s[[1]] ),
    data = data
  ))
}
## All three now work
ttest( mtcars, am, cyl )
ttest( mtcars, "am", "cyl" )
ttest( mtcars, colnames(mtcars)[9], colnames(mtcars)[2] )
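To see what the tidyselect step contributes, here is a minimal sketch (not part of the original answer) that calls eval_select() directly on mtcars with one bare name and one string. It returns a named vector of column positions, and the names are what ttest() converts to symbols.

```r
library(rlang)
library(tidyselect)

# Mix a bare name and a string in the same selection
sel <- eval_select(expr(c(am, "cyl")), mtcars)

sel
#>  am cyl
#>   9   2

# The names are the resolved column names, in the order requested
names(sel)
#> [1] "am"  "cyl"
```

Because both inputs end up as plain column names, the formula can then be assembled with new_formula() regardless of how the user spelled them.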
Upvotes: 3
Reputation: 206516
The nice philosophy of rlang is that you get to control when you want values to be evaluated via the !! and {{}} operators. You seem to want to make a function that takes strings, symbols, and (possibly evaluated) expressions all in the same parameter. Using symbols or bare strings is actually easy with ensym, but also wanting to allow for code like colnames(mtcars)[9] that has to be evaluated before returning a string is the problem. This can potentially be quite confusing. For example, what behavior do you expect when you run the following?
am <- 'disp'
cyl <- 'gear'
foo(mtcars, am, cyl)
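A minimal sketch of why this is ambiguous, using a hypothetical helper col_name() that is not part of the original code: ensym() captures the symbol the user typed, so a variable named am is ignored unless the caller unquotes it explicitly with !!.

```r
library(rlang)

# Hypothetical helper: report which column name ensym() would resolve to
col_name <- function(x) as_string(ensym(x))

am <- "disp"

col_name(am)    # the symbol itself is captured
#> [1] "am"

col_name(!!am)  # the caller explicitly asks for the variable's value
#> [1] "disp"
```

A function that silently evaluates some inputs has to guess between these two readings on the user's behalf.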
You could write a helper function if you want to assume that all "calls" should be evaluated but symbols and literals should not. Here's a "cleaner" function:
clean_quo <- function(x) {
  if (rlang::quo_is_call(x)) {
    x <- rlang::eval_tidy(x)
  } else if (!rlang::quo_is_symbolic(x)) {
    x <- rlang::quo_get_expr(x)
  }
  if (is.character(x)) x <- rlang::sym(x)
  if (!rlang::is_quosure(x)) x <- rlang::new_quosure(x)
  x
}
and you could use that in your function with
foo <- function(data, x, y) {
  x <- clean_quo(rlang::enquo(x))
  y <- clean_quo(rlang::enquo(y))
  # function without formula
  print(table(data %>% dplyr::pull(!!x), data %>% dplyr::pull(!!y)))
  # function with formula
  print(
    broom::tidy(stats::t.test(
      formula = rlang::new_formula(rlang::quo_get_expr(y), rlang::quo_get_expr(x)),
      data = data
    ))
  )
}
Doing so will allow all of these to return the same values:
foo(mtcars, am, cyl)
foo(mtcars, "am", "cyl")
foo(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])
But you are probably just delaying other possible problems. I would not recommend over-interpreting user intentions with this kind of code. That's why it's better to let users unquote their inputs explicitly, for example with !!. Perhaps provide two different versions of the function: one for parameters that require evaluation and one for those that do not.
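One way the two-version idea could look is sketched below. The names ttest_sym() and ttest_chr() are hypothetical and not from the original post. The first version never evaluates its column arguments; the second always does and expects each to evaluate to a single string.

```r
library(rlang)

# Version 1 (hypothetical name): columns given as bare names or literal
# strings; the arguments are captured, never evaluated
ttest_sym <- function(data, x, y) {
  x <- ensym(x)
  y <- ensym(y)
  stats::t.test(new_formula(y, x), data = data)
}

# Version 2 (hypothetical name): columns given as ordinary R values that
# must evaluate to a single string each
ttest_chr <- function(data, x, y) {
  stopifnot(is_string(x), is_string(y))
  stats::t.test(new_formula(sym(y), sym(x)), data = data)
}

res_sym <- ttest_sym(mtcars, am, cyl)
res_chr <- ttest_chr(mtcars, colnames(mtcars)[9], colnames(mtcars)[2])
```

With this split, the caller decides up front which evaluation rule applies, so a variable that happens to share a name with a column cannot change the result silently.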
Upvotes: 3