Reputation: 2489
I would like to use a vector of column names with a variety of step functions in the tidymodels
recipes package. My intuition was simply to use the following (prep and juice are used here only for illustration):
library(tidymodels)
library(modeldata)
data(biomass)
remove_vector <- c("oxygen","nitrogen")
test_recipe <- recipe(HHV ~ .,data = biomass) %>%
step_rm(remove_vector)
test_recipe %>%
prep %>%
juice %>%
head
But this returns the warning:
Note: Using an external vector in selections is ambiguous.
i Use `all_of(remove_vector)` instead of `remove_vector` to silence this message.
i See <https://tidyselect.r-lib.org/reference/faq-external-vector.html>.
This message is displayed once per session.
This concerns me, of course (I want my code to run without triggering any warnings or messages), but I still get the outcome I want.
However, when I follow the message's advice and use all_of:
test_recipe <- recipe(HHV ~ .,data = biomass) %>%
step_rm(all_of(remove_vector))
test_recipe %>%
prep %>%
juice %>%
head
I get the error message:
Error: Not all functions are allowed in step function selectors (e.g. `all_of`). See ?selections.
In ?selections, I can't find any reference to this exact (seemingly simple) problem.
Any ideas? Many thanks!
Upvotes: 0
Views: 1249
Reputation: 1367
If you use quasiquotation you won't get a warning:
library(tidymodels)
library(modeldata)
data(biomass)
remove_vector <- c("oxygen", "nitrogen")
test_recipe <- recipe(HHV ~ .,data = biomass) %>%
step_rm(!!!syms(remove_vector))
test_recipe %>%
prep %>%
juice %>%
head
More on the warning: the selection is ambiguous because your vector might have the same name as one of your columns. For example:
oxygen <- c("oxygen","nitrogen")
test_recipe <- recipe(HHV ~ .,data = biomass) %>%
step_rm(oxygen)
This removes only the oxygen column, because the column name takes precedence over the external vector. However, if you use !!!syms(oxygen), both columns are removed.
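The same ambiguity can be reproduced outside of recipes. The sketch below is only an illustration: it uses dplyr::select on a small made-up data frame (df and its values are hypothetical, not taken from biomass), and it assumes dplyr and rlang are installed. It compares a bare name, all_of() and !!!syms() when the external vector shares its name with a column.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data standing in for biomass
df <- data.frame(oxygen = 1:3, nitrogen = 4:6, HHV = 7:9)

# External vector deliberately named like one of the columns
oxygen <- c("oxygen", "nitrogen")

# Bare name: the column wins, so only `oxygen` is dropped
only_column <- names(select(df, -oxygen))

# all_of(): the external vector is used, so both columns are dropped
with_all_of <- names(select(df, -all_of(oxygen)))

# !!!syms(): the strings become symbols that are spliced into c()
with_syms <- names(select(df, -c(!!!syms(oxygen))))

print(only_column)
print(with_all_of)
print(with_syms)
```

Here only_column still contains nitrogen, while the other two results contain only HHV. Whether all_of() is accepted inside a step function depends on the installed recipes version, so the !!!syms() form is the one shown for step_rm above.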
Upvotes: 3