Reputation: 899
originally asked here
ggplot(data = bechdel, aes(x = domgross_2013)) +
  geom_histogram(bins = 10, color = "purple", fill = "white") +
  labs(title = "Domestic Growth of Movies", x = "Domestic Growth")
How come we are able to pass the column we would like to map to the x value (domgross_2013)? It seems to be passed like a variable rather than a string.
This is different from this post because, in order to reach that post, you must already know that non-standard evaluation exists and that it is what allows you to pass an "undefined variable". I didn't know that this kind of evaluation existed, and the explanation within that post is aimed at people who have much more R background, as well as an understanding of what the "undefined variable" is.
Upvotes: 1
Views: 1806
Reputation: 1492
For the second question, the reason you can pass x like a variable rather than a string is non-standard evaluation. Effectively, the function arguments are captured rather than being immediately evaluated, and are then evaluated within a scope where they exist. For example, with the quote() function, we can capture the input as-is and store it in var, rather than looking up the value of mpg. Then, we can evaluate it inside another environment, like the mtcars data frame.
var <- quote(mpg)
var
mpg
eval(var, envir = mtcars)
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
We can make a similar use of NSE within functions:
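f <- function(x) {
  # capture the expression passed as x without evaluating it
  input <- substitute(x)
  print(input)
  # evaluate the captured expression using the columns of mtcars
  eval(input, envir = mtcars)
}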
Here, we capture whatever was passed to the argument, and then evaluate it in the scope of the mtcars data frame.
f(cyl)
cyl
[1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4
You can read more about this at the above link and here.
We can achieve the same results without NSE, but the way we call the functions will differ. In this case, arguments will be immediately evaluated and you will get an object not found error if you pass an undefined variable to the function.
f <- function(x) {
  print(x)
  # x is evaluated normally, so it must be a string naming a column
  mtcars[[x]]
}
To use this function, mpg must be passed as a string.
f("mpg")
[1] "mpg"
[1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2 10.4
[16] 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4 15.8 19.7
[31] 15.0 21.4
You can see the results are identical to the first example, but in this case mpg is a string rather than a captured expression. The second line of the function can be interpreted as mtcars[["mpg"]]. Trying to use NSE with this function will result in an error:
f(mpg)
Error in print(x) : object 'mpg' not found
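To connect this back to the aes() call in the question, here is a minimal sketch of how a function could record several bare column names at once and evaluate them later against a data frame. The name toy_aes is hypothetical, and this is only an illustration of the capture-then-evaluate idea, not how ggplot2 actually implements aes().

```r
# Hypothetical helper (NOT ggplot2's real aes()): it only records the
# expressions it was given, without evaluating them.
toy_aes <- function(...) {
  # substitute(list(...)) yields the unevaluated call list(x = ..., y = ...);
  # dropping the first element removes the `list` function name itself.
  as.list(substitute(list(...)))[-1]
}

# wt and mpg do not exist as variables here, yet nothing fails:
mapping <- toy_aes(x = wt, y = mpg / 2)

# Later, each captured expression is evaluated with the data frame's
# columns in scope, much like eval(input, envir = mtcars) above.
x_vals <- eval(mapping$x, mtcars)
y_vals <- eval(mapping$y, mtcars)
```

The point of the sketch is only that the column names travel as expressions until some later step supplies the data frame in which they make sense.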
Upvotes: 3
Reputation: 145755
This is called non-standard evaluation (NSE). It can be a nicer interface, since you don't need to use strings, data_frame$column_name, or other, longer syntax. It requires special handling in the way the function is written. The Advanced R book has a chapter on non-standard evaluation, which is a good place to dig into the mechanics of it. I'll quote the outline of the chapter here to give an idea of what is covered, and to make the point that the topic is too complex to explain well in a single answer on Stack Overflow; a full chapter of a book is much more appropriate.
Outline
- Capturing expressions teaches you how to capture unevaluated expressions using substitute().
- Non-standard evaluation shows you how subset() works by combining substitute() with eval() to allow you to succinctly select rows from a data frame.
- Scoping issues discusses scoping issues specific to NSE, and will show you how to resolve them.
- Calling from another function shows why every function that uses NSE should have an escape hatch, a version that uses regular evaluation.
- Substitute teaches you how to use substitute() to work with functions that don’t have an escape hatch.
- The downsides finishes off the chapter with a discussion of the downsides of NSE.
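As a rough illustration of the "combining substitute() with eval()" and "escape hatch" items in that outline, here is a minimal sketch. The names filter_rows and filter_rows_q are hypothetical, and this is a simplified stand-in, not the book's code or base R's subset().

```r
# Escape hatch: standard evaluation. The caller supplies an already-quoted
# expression, plus (optionally) the environment for non-column variables.
filter_rows_q <- function(df, condition_expr, env = parent.frame()) {
  # Columns of df are looked up first, then variables in env
  keep <- eval(condition_expr, df, env)
  df[!is.na(keep) & keep, , drop = FALSE]
}

# NSE front end: captures the bare condition and delegates.
filter_rows <- function(df, condition) {
  caller_env <- parent.frame()  # resolved eagerly, before delegating
  filter_rows_q(df, substitute(condition), env = caller_env)
}

threshold <- 30
high_mpg <- filter_rows(mtcars, mpg > threshold)            # bare expression
same     <- filter_rows_q(mtcars, quote(mpg > threshold))   # quoted expression
nrow(high_mpg)  # 4
```

Passing the caller's environment explicitly is one way to handle the scoping issue the outline mentions: threshold is not a column of mtcars, so it has to be found where the user wrote the call.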
It's worth noting that non-standard evaluation shows up even in base R, although its heaviest use seems to be in packages like dplyr, data.table, and ggplot2.
## Examples of NSE in base R:
## library() has non-standard evaluation optionally
library(ggplot2) # this works even though `ggplot2` isn't an object
library("ggplot2") # also works with standard evaluation
## by contrast, install.packages does not allow NSE
install.packages(ggplot2) ## throws an error
# Error in install.packages : object 'ggplot2' not found
install.packages("ggplot2") ## quotes are needed here
## subset() uses NSE on column names,
## even letting you use `:` to choose consecutive columns
subset(mtcars, mpg > 22, select = mpg:hp)
## with() is a wrapper that allows non-standard evaluation inside it
with(mtcars, mpg / wt + disp)
Upvotes: 3
Reputation: 44788
The way R passes arguments is to pass the expression to the function. Most functions will evaluate it (this happens automatically when you reference the variable in a normal way), but it is also possible to access the expression itself, and that's what ggplot2 functions do in a lot of situations. As others have said, this is called "non-standard evaluation" or NSE.
The usual way to access the expression is with the substitute() function. For example,
f <- function(x) substitute(x)
f(y + z)
#> y + z
Created on 2022-01-19 by the reprex package (v2.0.1)
This works even if variables y and z don't exist, because that function doesn't ever do standard evaluation on the x argument.
This is also used by R in lazy evaluation. It won't evaluate an argument until it needs the value, so sometimes you get surprising results, because the value may have changed between the time you called the function and the time you evaluate its arguments.
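A small sketch of that lazy-evaluation surprise, using made-up function names g and h chosen only for illustration:

```r
y <- 1

g <- function(x) {
  y <<- 100  # changes y in the enclosing environment before x is needed
  x          # only now is the argument expression `y + 1` evaluated
}

result <- g(y + 1)
result  # 101, not 2, because y changed before the argument was evaluated

# An argument that is never referenced is never evaluated at all:
h <- function(x) "x was never evaluated"
unused <- h(stop("this error is never raised"))
```

Here y was 1 when g() was called, but the argument is evaluated only when x is first used, by which point y is already 100.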
Upvotes: 0