andrewH
Reputation: 2321

How can one make visible the difference in the outputs of quote() and substitute()?

As applied to the same R code or objects, quote and substitute typically return different objects. How can one make this difference apparent?

is.identical <- function(X){
  out <- identical(quote(X), substitute(X))
  out
}

tmc <- function(X){
  out <- list(typ = typeof(X), mod = mode(X), cls = class(X))
  out
}

> df1 <- data.frame(a = 1, b = 2)

Here the printed outputs of quote and substitute are the same.

> quote(df1)
df1
> substitute(df1)
df1

And the structures of the two are the same.

> str(quote(df1))
 symbol df1
> str(substitute(df1))
 symbol df1

And the type, mode and class are all the same.

> tmc(quote(df1))
$typ
[1] "symbol"
$mod
[1] "name"
$cls
[1] "name"

> tmc(substitute(df1))
$typ
[1] "symbol"
$mod
[1] "name"
$cls
[1] "name"

And yet, the outputs are not the same.

> is.identical(df1)
[1] FALSE

Note that this question shows some inputs that cause the two functions to display different outputs. However, the outputs are different even when they appear the same, and are the same by most of the usual tests, as shown by the output of is.identical() above. What is this invisible difference, and how can I make it appear?

Note on the tags: I am guessing that the Common Lisp quote and the R quote are similar.

Upvotes: 4

Views: 166

Answers (1)

joran
Reputation: 173577

The reason is that the behavior of substitute() is different based on where you call it, or more precisely, what you are calling it on.

Understanding what will happen requires a very careful parsing of the (subtle) documentation for substitute(), specifically:

Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in env, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using delayedAssign(), the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.
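A minimal sketch of those three cases side by side, evaluated inside a function so that env is not .GlobalEnv. The names show_cases, local_var and not_defined_here are made up for illustration and are not part of the question:

```r
# Hypothetical helper: each element exercises one case from the documentation.
show_cases <- function(X) {
  local_var <- 10                              # ordinary variable in the function's frame
  list(
    unbound  = substitute(not_defined_here),   # not bound in env: symbol left unchanged
    promise  = substitute(X),                  # formal argument: promise expression replaces X
    ordinary = substitute(local_var)           # ordinary variable: its value is substituted
  )
}

# The argument is never evaluated, so df1 does not even need to exist here.
res <- show_cases(df1 + 1)
str(res)
```

Here res$unbound should still be a symbol, res$promise the unevaluated call df1 + 1, and res$ordinary the number 10.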

So there are essentially three options.

In this case:

> df1 <- data.frame(a = 1, b = 2)
> identical(quote(df1),substitute(df1))
[1] TRUE

df1 is an "ordinary variable", but substitute() is called in .GlobalEnv, since the env argument defaults to the current evaluation environment. Hence we're in the very last case, where the symbol df1 is left unchanged, and so it is identical to the result of quote(df1).

In the context of the function:

is.identical <- function(X){
    out <- identical(quote(X), substitute(X))
    out
}

The important distinction is that now we're calling these functions on X, not df1. For most R users, this is a silly, trivial distinction, but when playing with subtle tools like substitute it becomes important. X is a formal argument of a function, so that implies we're in a different case of the documented behavior.

Specifically, it says that now "the expression slot of the promise replaces the symbol". We can see what this means if we debug() the function and examine the objects in the context of the function environment:

> debugonce(is.identical)
> is.identical(X = df1)
debugging in: is.identical(X = df1)
debug at #1: {
    out <- identical(quote(X), substitute(X))
    out
}
Browse[2]> 
debug at #2: out <- identical(quote(X), substitute(X))
Browse[2]> str(quote(X))
 symbol X
Browse[2]> str(substitute(X))
 symbol df1
Browse[2]> Q

Now we can see that what happened is precisely what the documentation said would happen (Ha! So obvious! ;) )

X is a formal argument, and therefore a promise, which according to R is not the same thing as df1. For most people writing functions they are effectively the same, but the internal implementation disagrees. Because X is a promise object, substitute replaces the symbol X with the expression it "points to", namely df1. This is what the docs mean by the "expression slot of the promise": it is what R sees in the X = df1 part of the function call.
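If you would rather not step through the debugger, one way to make the difference visible is to deparse both results inside the function and return them. This is only a sketch; show_both is a hypothetical helper name, not something from the question:

```r
# Hypothetical helper: return both results as strings so they can be compared
# outside the function, where the top-level tests above cannot see them.
show_both <- function(X) {
  c(quote      = deparse(quote(X)),       # always the literal symbol X
    substitute = deparse(substitute(X)))  # the expression the caller supplied
}

df1 <- data.frame(a = 1, b = 2)
both <- show_both(df1)
both   # quote is "X", substitute is "df1"
```

The tests in the question (str, typeof, mode, class) were all run at top level on df1, where both functions return the same symbol. Inside is.identical both results are still symbols of the same type, mode and class, but they are different symbols, X and df1, which is why identical() returns FALSE.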

To round things out, try to guess what will happen in this case:

is.identical <- function(X){
    out <- identical(quote(A), substitute(A))
    out
}

is.identical(X = df1)

(Hint: now A is not a "bound symbol in the environment".)

A final example illustrating more directly the final case in the docs with the confusing exception:

#Ordinary variable, but in .GlobalEnv
> a <- 2
> substitute(a)
a

#Ordinary variable, but NOT in .GlobalEnv
> e <- new.env()
> e$a <- 2
> substitute(a,env = e)
[1] 2
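The quoted documentation also mentions promises created explicitly with delayedAssign(). A short sketch of that case, again outside .GlobalEnv (the names e_promise and p are illustrative assumptions):

```r
# Promise created explicitly in a non-global environment
e_promise <- new.env()
delayedAssign("p", 1 + 2, assign.env = e_promise)

# substitute() returns the promise's expression, not its value
promise_expr <- substitute(p, env = e_promise)
deparse(promise_expr)   # "1 + 2"
```

So the promise rule is the same whether the promise comes from a function call or from delayedAssign(): the unevaluated expression replaces the symbol.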

Upvotes: 4