Reputation: 45
What I am trying to do
I am trying to write a function that returns the names of certain variables of a dataset. For a test tibble test <- tibble(x1 = 1:3, x2 = 2:4, x3 = 3:5, x4 = 4:6), I want a function
assign_predictors_argument <- function(dataset, outcome, predictors) {
  ...
}
such that:
- If predictors is not defined, predictors will be set to all variables in dataset apart from outcome. E.g. assign_predictors_argument(test, x1) will return c(x2, x3, x4).
- If predictors is defined, the function will return that value. E.g. assign_predictors_argument(test, x1, c(x2, x3)) will return c(x2, x3).
What I have tried
assign_predictors_argument <- function(dataset, outcome, predictors) {
  if (missing(predictors)) {
    predictors <- dataset %>%
      dplyr::select(-{{ outcome }}) %>%
      names()
  }
  predictors
}
What went wrong
Case 1: predictors argument missing
assign_predictors_argument(test, x1) gives the result "x2" "x3" "x4". However, I want this to return c(x2, x3, x4).
How do I convert this character vector to a form like the input?
Case 2: predictors argument defined
assign_predictors_argument(test, x1, c(x2, x3)) gives
Error in assign_predictors_argument(test, x1, c(x2, x3)) :
  object 'x2' not found
It appears that the last line of the function tries to evaluate and return predictors. As x2 is not defined in the environment, this raises an error.
I have tried a) changing the final line to {{predictors}} as well as b) changing missing(predictors) to is.null(predictors) and putting in a default predictors = NULL (following this). Neither has worked.
How can I return the value of predictors without either a) changing its form or b) evaluating it?
Upvotes: 3
Views: 517
Reputation: 13691
You were close:
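assign_predictors_argument <- function(dataset, outcome, predictors) {
  if (missing(predictors)) {
    # Build the expression c(x2, x3, ...) from the remaining column names
    dataset %>%
      dplyr::select(-{{ outcome }}) %>%
      names() %>%
      {rlang::expr(c(!!!rlang::syms(.)))}
  } else {
    # Return the user's argument exactly as typed, without evaluating it
    rlang::enexpr(predictors)
  }
}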
assign_predictors_argument(test, x1)
# c(x2, x3, x4)
assign_predictors_argument(test, x1, c(x2, x3))
# c(x2, x3)
In the above, rlang::expr() constructs the expression that you want by 1) converting names to symbols with syms() and 2) splicing them together inside the c(...) expression with the unquote-splice operator !!!.
For the second portion, you can simply capture the expression supplied by the user with rlang::enexpr().
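To see the two pieces in isolation, here is a minimal sketch outside the function. It assumes rlang is installed; col_names and capture() are made-up names for illustration only and stand in for the output of names() and for the user-facing function.

```r
library(rlang)

# Hypothetical column names, standing in for the output of names()
col_names <- c("x2", "x3", "x4")

# syms() turns each string into a symbol; !!! splices them into c(...)
built <- expr(c(!!!syms(col_names)))
print(built)
#> c(x2, x3, x4)

# enexpr() returns the argument as the caller typed it, unevaluated
# (capture() is a hypothetical helper, not part of rlang)
capture <- function(arg) enexpr(arg)
captured <- capture(c(x2, x3))
print(captured)
#> c(x2, x3)
```

Neither x2 nor x3 needs to exist anywhere for this to run, because nothing is ever evaluated.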
Upvotes: 2
Reputation: 173793
You say you want to return something like c(x2, x3, x4). Let's first be clear what this object is. It is an unevaluated call to the function c. It is not a vector of names. You will be able to use it in tidy evaluation, but it will require the !! operator.
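A small base R sketch may make the distinction concrete. The data frame df below is made up purely for illustration; the point is only that the object is a call that can be inspected or evaluated later, not a character vector.

```r
# An unevaluated call to c, as opposed to a character vector of names
vars <- quote(c(x2, x4))

is.call(vars)       # TRUE: it is a call object
is.character(vars)  # FALSE: it is not a vector of strings
class(vars[[1]])    # "name": the first element is the symbol c

# Hypothetical data frame; evaluating the call inside it looks up the columns
df <- data.frame(x1 = 1:3, x2 = 2:4, x4 = 4:6)
combined <- eval(vars, df)
print(combined)
#> [1] 2 3 4 4 5 6
```

Until eval() is called, x2 and x4 are just symbols, which is why the object can be passed around even when no such variables exist in the calling environment.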
This is quite tricky to achieve. You need to capture the predictors argument and ensure it is either a single variable name or a call to c. Any other expression passed to predictors should probably throw an error.
If predictors is missing and you are getting the column names as characters, then you must convert these to names with as.name and stick them in a c call. If predictors is a single variable, it must be returned unevaluated. If it is a c call, it should also be returned unevaluated. Otherwise an error is thrown.
So the function might look something like this:
assign_predictors_argument <- function(dataset, outcome, predictors) {
  if (missing(predictors)) {
    # Convert the remaining column names to symbols and wrap them in a call to c
    predictors <- dataset %>%
      dplyr::select(-{{ outcome }}) %>%
      names() %>%
      lapply(as.name)
    predictors <- as.call(c(quote(c), predictors))
  } else {
    # Capture the argument as typed, without evaluating it
    predictors <- as.list(match.call())$predictors
    # Reject any call other than c(...); a bare symbol passes through
    if (is.call(predictors) && !identical(predictors[[1]], quote(c))) {
      stop("'predictors' must be either a single variable or vector of names")
    }
  }
  predictors
}
So let's test it out:
test <- dplyr::tibble(x1 = 1:3, x2 = 2:4, x3 = 3:5, x4 = 4:6)
# Test with missing predictors
assign_predictors_argument(test, x1)
#> c(x2, x3, x4)
# Test with single predictor
assign_predictors_argument(test, x1, x2)
#> x2
# Test with multiple predictors
assign_predictors_argument(test, x1, c(x3, x4))
#> c(x3, x4)
# Test with call other than call to c
assign_predictors_argument(test, x1, as.name("x3"))
#> Error in assign_predictors_argument(test, x1, as.name("x3")):
#> 'predictors' must be either a single variable or vector of names
This all looks correct. So to use it, we might do something like this:
vars <- assign_predictors_argument(test, x1, c(x2, x4))
vars
#> c(x2, x4)
test %>% select(!!vars)
#> # A tibble: 3 x 2
#>      x2    x4
#>   <int> <int>
#> 1     2     4
#> 2     3     5
#> 3     4     6
Created on 2020-07-10 by the reprex package (v0.3.0)
Upvotes: 1