Reputation: 1025
I want to use variable names as strings in dplyr functions. See the example below:
df <- data.frame(
color = c("blue", "black", "blue", "blue", "black"),
value = 1:5)
filter(df, color == "blue")
It works perfectly, but I would like to refer to color by string, something like this:
var <- "color"
filter(df, this_probably_should_be_a_function(var) == "blue")
I would be happy to do this by any means, and super-happy to make use of easy-to-read dplyr syntax.
Upvotes: 48
Views: 43718
Reputation: 886978
In newer versions, we can create the variable as a quosure and then unquote it (with UQ or !!) for evaluation:
var <- quo(color)
filter(df, UQ(var) == "blue")
# color value
#1 blue 1
#2 blue 3
#3 blue 4
Due to operator precedence, we may need to wrap !! in ():
filter(df, (!!var) == "blue")
# color value
#1 blue 1
#2 blue 3
#3 blue 4
In newer versions, !! has higher precedence, so
filter(df, !! var == "blue")
should work (as @Moody_Mudskipper commented).
We may also use:
filter(df, get(var, envir=as.environment(df))=="blue")
#color value
#1 blue 1
#2 blue 3
#3 blue 4
EDIT: Rearranged the order of solutions
Upvotes: 37
Reputation: 226097
rlang version >= 0.4.0
.data is now recognized as a way to refer to the parent data frame, so reference by string works as follows:
var <- "color"
filter(df, .data[[var]] == "blue")
If the variable is already a symbol, then {{}} will dereference it properly.
Example 1:
var <- quo(color)
filter(df, {{var}} == "blue")
or more realistically
f <- function(v) {
filter(df, {{v}} == "blue")
}
f(color) # Curly-curly provides automatic NSE support
More reading and examples are provided in the Programming with dplyr article/vignette.
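As a small sketch of how curly-curly extends beyond filter, the function below forwards two unquoted column arguments to group_by and summarise. The helper name mean_by and the output column m are hypothetical, chosen only for illustration.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

# Hypothetical helper: both column arguments are passed unquoted,
# and {{ }} forwards them into the dplyr verbs.
mean_by <- function(data, group_col, value_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(m = mean({{ value_col }}))
}

res <- mean_by(df, color, value)
```

The caller writes mean_by(df, color, value) just as they would write a direct dplyr call, so no quoting or unquoting is needed at the call site.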
Upvotes: 16
Reputation: 1268
This question was posted 6 years ago. dplyr is now up to version 1.0.2. Yet this is still a great discussion and helped me immensely with my problem. I wanted to be able to construct filters from columns, operators, and values that are all specified by variables in memory. Oh, and for an indeterminate number of filters!
Consider the following list where I specify the column, the operator, and the value for two filters:
myFilters =
list(
list(var = "color", op = "%in%", val = "blue"),
list(var = "value", op = "<=", val = 3)
)
From this list, I want to run:
dplyr::filter(df, color %in% "blue", value <= 3)
We can use lapply on the list above to create a list of call objects, splice the calls into the arguments with the !!! operator so that they are evaluated, and pass that to filter:
library(dplyr)
df <- data.frame(
color = c("blue", "black", "blue", "blue", "black"),
value = 1:5)
result =
lapply(myFilters, function(x) call(x$op, as.name(x$var), x$val)) %>%
{filter(df, !!!.)}
...and Shazam!
> result
color value
1 blue 1
2 blue 3
That's a lot to absorb, so if it isn't immediately apparent what's happening, let me unpack it a bit. Consider:
var = "color"
op = "%in%"
val = "blue"
I'd want to be able to run:
filter(df, color %in% "blue")
and if I also have:
var2 = "value"
op2 = "<="
val2 = 3
I might want to be able to get:
filter(df, color %in% "blue", value <= 3)
The solution uses calls, which are unevaluated expressions. (See Hadley's Advanced R book.) Basically, make a list of call objects from variables, and then force evaluation of the calls using the !!! operator when calling dplyr::filter.
call1 = call(op, as.name(var), val)
Here is the value of call1:
> call1
color %in% "blue"
Let's create another call:
call2 = call(op2, as.name(var2), val2)
Put them in a list:
calls = list(call1, call2)
and use !!! to evaluate the list of calls prior to sending them to filter:
result = filter(df, !!!calls)
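The steps above can be wrapped in a single function so that any number of filter specifications is handled the same way. This is a minimal sketch; apply_filters is a hypothetical name, and it assumes each specification is a list with var, op, and val elements as in myFilters.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

# Hypothetical helper: build one call per filter spec, then splice
# all of them into filter() with !!!.
apply_filters <- function(data, filters) {
  calls <- lapply(filters, function(f) call(f$op, as.name(f$var), f$val))
  filter(data, !!!calls)
}

myFilters <- list(
  list(var = "color", op = "%in%", val = "blue"),
  list(var = "value", op = "<=", val = 3)
)

res <- apply_filters(df, myFilters)

# A single call can also be checked on its own against the data frame:
second_mask <- eval(call("<=", as.name("value"), 3), df)
```

Because filter combines multiple conditions with a logical AND, every specification in the list must hold for a row to be kept.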
Upvotes: 3
Reputation: 3568
An update. The new dplyr 1.0.0 has some fantastic new functionality that makes solving these sorts of problems far easier. You can read about it in the 'programming' vignette accompanying the new package.
Basically, the .data[[foo]] pronoun allows you to pass strings into functions more easily.
So you can do this:
filtFunct <- function(d, var, crit) {
filter(d, .data[[var]] %in% crit)
}
filtFunct(df, "value", c(2,4))
# color value
# 1 black 2
# 2 blue 4
filtFunct(df, "color", "blue")
# color value
# 1 blue 1
# 2 blue 3
# 3 blue 4
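The same .data[[var]] pattern can be applied repeatedly when several columns are given as strings. The sketch below assumes the criteria arrive as a named list (column name mapped to allowed values); filter_all is a hypothetical helper name.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

# Hypothetical helper: criteria is a named list, where each name is a
# column name (string) and each element holds the allowed values.
filter_all <- function(d, criteria) {
  for (nm in names(criteria)) {
    d <- filter(d, .data[[nm]] %in% criteria[[nm]])
  }
  d
}

res <- filter_all(df, list(color = "blue", value = 1:3))
```

Each pass narrows the data frame further, so the result keeps only rows that satisfy every criterion.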
Upvotes: 5
Reputation: 3568
Several of the solutions above did not work for me. There is also the as.symbol function, which we wrap in !!. Seems a bit simpler, sort of.
set.seed(123)
df <- data.frame(
color = c("blue", "black", "blue", "blue", "black"),
shape = c("round", "round", "square", "round", "square"),
value = 1:5)
Now enter the variable as a string into the dplyr functions by passing it through as.symbol() and !!:
var <- "color"
filter(df, !!as.symbol(var) == "blue")
# color shape value
# 1 blue round 1
# 2 blue square 3
# 3 blue round 4
var <- "shape"
df %>% group_by(!!as.symbol(var)) %>% summarise(m = mean(value))
# shape m
# <fct> <dbl>
# 1 round 2.33
# 2 square 4
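When there are several column names held as strings, the same idea can be sketched with rlang::syms(), which converts a character vector into a list of symbols, and !!!, which splices that list into the call. The vector name vars is an assumption for illustration, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  shape = c("round", "round", "square", "round", "square"),
  value = 1:5
)

# Column names supplied as strings
vars <- c("color", "shape")

# syms() turns each string into a symbol; !!! splices them in as
# separate grouping arguments.
res <- df %>%
  group_by(!!!rlang::syms(vars)) %>%
  summarise(m = mean(value), .groups = "drop")
```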
Upvotes: 7
Reputation: 11431
dplyr versions [0.3 - 0.7) (? - June 2017)
(For more recent dplyr versions, please see other answers to this question.)
As of dplyr 0.3, every dplyr function using non-standard evaluation (NSE, see release post and vignette) has a standard evaluation (SE) twin ending in an underscore. These can be used for passing variables. For filter it will be filter_. Using filter_ you may pass the logical condition as a string.
filter_(df, "color=='blue'")
# color value
# 1 blue 1
# 2 blue 3
# 3 blue 4
Constructing the string with the logical condition is of course straightforward:
l <- paste(var, "==", "'blue'")
filter_(df, l)
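The underscore verbs were deprecated in dplyr 0.7. A rough equivalent of the string-based condition in current versions, sketched here under the assumption that rlang is installed (dplyr depends on it), is to parse the string into an expression with rlang::parse_expr() and unquote it:

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

var <- "color"

# Build the condition as a string, parse it into an expression,
# then unquote the expression inside filter().
cond <- rlang::parse_expr(paste(var, "==", "'blue'"))
res <- filter(df, !!cond)
```

Parsing strings into code is best kept to trusted input, since whatever the string contains will be evaluated.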
Upvotes: 27
Reputation: 2800
As of dplyr 0.7, some things have changed again.
library(dplyr)
df <- data.frame(
color = c("blue", "black", "blue", "blue", "black"),
value = 1:5)
filter(df, color == "blue")
# it was already possible to use a variable for the value
val <- 'blue'
filter(df, color == val)
# As of dplyr 0.7, new functions were introduced to simplify the situation
col_name <- quo(color) # captures the current environment
df %>% filter((!!col_name) == val)
# Remember to use enquo within a function
filter_col <- function(df, col_name, val){
col_name <- enquo(col_name) # captures the environment in which the function was called
df %>% filter((!!col_name) == val)
}
filter_col(df, color, 'blue')
More general cases are explained in the dplyr programming vignette.
Upvotes: 17
Reputation: 2074
Here is one way to do it using the sym() function in the rlang package:
library(dplyr)
df <- data.frame(
main_color = c("blue", "black", "blue", "blue", "black"),
secondary_color = c("red", "green", "black", "black", "red"),
value = 1:5,
stringsAsFactors=FALSE
)
filter_with_quoted_text <- function(column_string, value) {
col_name <- rlang::sym(column_string)
df1 <- df %>%
filter(UQ(col_name) == UQ(value))
df1
}
filter_with_quoted_text("main_color", "blue")
filter_with_quoted_text("secondary_color", "red")
Upvotes: 5
Reputation: 54237
Often asked, but still no easy support afaik. However, with regards to this posting:
eval(substitute(filter(df, var == "blue"),
list(var = as.name(var))))
# color value
# 1 blue 1
# 2 blue 3
# 3 blue 4
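Two related sketches in the same spirit: bquote() can build the call in place of substitute(), and plain base R subsetting avoids dplyr entirely when only a simple string lookup is needed. The result names res_bquote and res_base are hypothetical.

```r
library(dplyr)

df <- data.frame(
  color = c("blue", "black", "blue", "blue", "black"),
  value = 1:5
)

var <- "color"

# bquote() evaluates the part inside .() and inserts the resulting
# symbol into the call, which is then evaluated.
res_bquote <- eval(bquote(filter(df, .(as.name(var)) == "blue")))

# Base R alternative: [[ ]] looks the column up by its string name.
res_base <- df[df[[var]] == "blue", ]
```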
Upvotes: 7