Ryan

Reputation: 490

How does R ggplot2 get the column names via aes?

I understand how to use aes, but I don't understand the programmatic paradigm.

When I use ggplot, assuming I have a data.frame with column names "animal" and "weight", I can do the following.

ggplot(df, aes(x=weight)) + facet_grid(~animal) + geom_histogram()

What I don't understand is that weight and animal are not supposed to be strings, they are just typed out as is. How is it I can do that? It should be something like this instead:

ggplot(df, aes(x='weight')) + facet_grid('~animal') + geom_histogram()

I don't "declare" weight or animal as vectors anywhere? This seems to be... really unusual? Is this like a macro or something where it gets aes "whole," looks into df for its column names, and then fills in the gaps where it sees those variable names in aes?

I guess what I would like is to see some similar function in R which can take variables which are not declared in the scope, and the name of this feature, so I can read further and maybe implement my own similar functions.

Upvotes: 3

Views: 769

Answers (1)

MrFlick

Reputation: 206253

In R this is called non-standard evaluation. There is a chapter on non-standard evaluation in the Advanced R book, which is available free online. Basically, R can look at the call stack to see the symbol that was passed to the function, rather than just the value that symbol points to. It's used a lot in base R. It's also used in a slightly different way in the tidyverse, which has a formal class called a quosure to make this stuff easier to work with.
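As a minimal base R sketch of the idea (the helper names `arg_name` and `col_mean` are made up for illustration, and `df` is a toy data frame), `substitute()` captures what was typed and `eval()` evaluates it with the data frame's columns in scope:

```r
# Hypothetical helper: return the argument as typed, without evaluating it.
arg_name <- function(x) {
  deparse(substitute(x))
}

# Hypothetical helper: capture an expression, then evaluate it inside `data`.
col_mean <- function(data, expr) {
  captured <- substitute(expr)                    # the expression, not its value
  values <- eval(captured, data, parent.frame())  # columns of `data` are found first
  mean(values)
}

# Toy data, assumed for this example only.
df <- data.frame(animal = c("cat", "dog", "cat"), weight = c(4, 20, 6))

arg_name(weight)          # "weight", even though no variable `weight` exists
col_mean(df, weight)      # 10
col_mean(df, weight * 2)  # 20
```

Note that `weight` is never defined as a variable; it only has meaning once the captured expression is evaluated against `df`.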

These methods are great for interactive programming. They save keystrokes and clutter, but if you write functions that depend too heavily on this feature, they become difficult to script or to call from other functions.
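A small sketch of that difficulty, using base R's `with()` and a toy data frame assumed for illustration: when the column name is stored in a variable, the non-standard interface no longer does what you want, and you have to fall back to standard evaluation or build the symbol yourself.

```r
# Toy data, assumed for this example only.
df <- data.frame(animal = c("cat", "dog", "cat"), weight = c(4, 20, 6))

col <- "weight"  # column name held in a variable, as in a script or wrapper function

# NSE looks for a column literally named `col`, finds none,
# and falls back to the variable, so this returns the string.
from_nse <- with(df, col)

# Standard evaluation: index by the string.
from_se <- df[[col]]

# Or turn the string into a symbol and evaluate it inside the data frame.
from_symbol <- eval(as.name(col), df)
```

Here `from_nse` is just `"weight"`, while `from_se` and `from_symbol` both hold the actual column.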

The formula syntax (the one with the ~) is probably the safest and most programmatic way to work with symbols. It captures symbols that can later be evaluated in the context of a data.frame with functions like model.frame(). And there are built-in functions to help manipulate formulas, like update() and reformulate().
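A short sketch of those formula tools in base R, using a toy data frame assumed for illustration:

```r
# Toy data, assumed for this example only.
df <- data.frame(animal = c("cat", "dog", "cat"), weight = c(4, 20, 6))

# A formula stores the symbols unevaluated.
f <- weight ~ animal
vars <- all.vars(f)              # the variable names as strings

# Evaluate the formula's variables in the context of the data frame.
mf <- model.frame(f, data = df)

# Modify an existing formula; `.` stands for "what was there before".
f_log <- update(f, log(.) ~ .)

# Build a formula from strings, which is handy inside functions.
f_built <- reformulate("animal", response = "weight")
```

Because `reformulate()` accepts plain character vectors, it is one way to bridge between column names stored as strings and functions that expect a formula.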

And since you were explicitly interested in the aes() call, you can get the source code for any function in R just by typing its name without parentheses. With ggplot2_2.2.1, the function looks like this:

aes
# function (x, y, ...) 
# {
#     aes <- structure(as.list(match.call()[-1]), class = "uneval")
#     rename_aes(aes)
# }
# <environment: namespace:ggplot2>
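To see what the `match.call()` line does, here is a stripped-down sketch (the name `my_aes` is made up, it skips the `rename_aes()` step and the "uneval" class, and `df` is a toy data frame):

```r
# Hypothetical, simplified stand-in for the old aes(): keep the arguments
# exactly as they were typed, without evaluating them.
my_aes <- function(x, y, ...) {
  as.list(match.call()[-1])  # drop the function name, keep the arguments
}

# Toy data, assumed for this example only.
df <- data.frame(animal = c("cat", "dog", "cat"), weight = c(4, 20, 6))

mapping <- my_aes(x = weight, y = weight / 2)

is.name(mapping$x)  # TRUE: a bare symbol, not a value
is.call(mapping$y)  # TRUE: an unevaluated expression

# Later, the stored expressions can be evaluated against the data.
x_values <- eval(mapping$x, df)
y_values <- eval(mapping$y, df)
```

So the mapping is just a list of unevaluated symbols and calls, which is why nothing complains that `weight` is undefined until the expressions are evaluated with the data frame supplying the columns.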

The newest version of ggplot2 uses different rlang methods to be more consistent with other tidyverse libraries, so it looks a bit different.

Upvotes: 9
