Aegis

Reputation: 200

Inconsistent assign() behavior in simple piping with Tidyverse

By simply changing the argument order in the join step, I can get the code below to run. I just installed the most recent version of Tidyverse as of this post (1.3.1), and I'm using R version 4.1.1 (2021-08-10), "Kick Things". Please end my madness:

Updates:

library(dplyr)

# Doesn't run
if (exists("test")) rm("test")
iris %>%
  assign(x = "test", value = ., envir = .GlobalEnv) %>%
  left_join(x = test, y = ., by = "Species")

# Runs
if (exists("test")) rm("test")
iris %>%
  assign(x = "test", value = ., envir = .GlobalEnv) %>%
  left_join(x = ., y = test, by = "Species")

Upvotes: 3

Views: 210

Answers (1)

Allan Cameron

Reputation: 174478

The pipe makes things a little more confusing here, but we get the same effect if we write the same code as nested functions:

#Doesn't run
if(exists("test")) rm("test")
left_join(x = test, y = assign("test", iris, envir = .GlobalEnv), by = "Species")

#Runs
if(exists("test")) rm("test")
left_join(x = assign("test", iris, envir = .GlobalEnv), y = test, by = "Species")

Written out like this, it becomes clear why the first version doesn't run: you are calling left_join on an object that doesn't exist yet. Because left_join is an S3 generic, it evaluates only x to determine method dispatch, and passes all the other arguments as unevaluated promises to left_join.data.frame. Since y has not been evaluated at that point, assign() has not run and test has not been written, so we get an error saying that test was not found.

In the second version, evaluating x for dispatch runs assign(), which writes test. The y parameter isn't evaluated until it is required inside left_join.data.frame, and by the time it is evaluated, test has already been written.

So this odd behaviour is a result of lazy evaluation.
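
The same effect can be reproduced without dplyr, which may make the mechanism easier to see. The sketch below uses a made-up S3 generic, my_join, as a stand-in for left_join (the name and its body are assumptions for illustration, not dplyr's implementation). It relies only on base R and the built-in iris data.

```r
# Hypothetical S3 generic standing in for dplyr::left_join.
# UseMethod() forces only the first argument (x) in order to dispatch.
my_join <- function(x, y) UseMethod("my_join")

my_join.data.frame <- function(x, y) {
  # y is still an unevaluated promise on entry; it is forced on first use here
  nrow(x) + nrow(y)
}

# Fails: dispatch forces x first, and `test` does not exist yet.
# The assign() in y is never reached.
if (exists("test")) rm("test")
res_fail <- tryCatch(
  my_join(x = test, y = assign("test", iris, envir = .GlobalEnv)),
  error = function(e) conditionMessage(e)
)

# Works: forcing x runs assign(), so `test` exists by the time y is forced.
if (exists("test")) rm("test")
res_ok <- my_join(x = assign("test", iris, envir = .GlobalEnv), y = test)
```

Here res_fail holds the error message from the failed lookup of test, while res_ok is computed normally because assign() has already run when y is finally evaluated.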

Upvotes: 5
