Megatron

Reputation: 17099

Are there limits to tidy evaluation scope?

I'm trying to use tidy evaluation, as introduced in dplyr 0.7.0.

However, during a function call within mutate() I am getting an error. It seems that the variables are not being evaluated as I expect.

library(dplyr)
library(tibble)
library(tidyr)

myCor <- function(col1, col2) {
  col1 <- enquo(col1)
  col2 <- enquo(col2)

  mtcars %>%
    rownames_to_column("vehicle") %>%
    select(vehicle, !!col1, !!col2) %>%
    mutate(correlation=cor(!!col1, !!col2))
}

myCor("mpg", "disp")
# Error in mutate_impl(.data, dots) : 
#   Evaluation error: 'x' must be numeric.

Instead, I have to use this non-tidy eval syntax to get the desired output.

myCor <- function(col1, col2) {
  col1_tidy <- enquo(col1)
  col2_tidy <- enquo(col2)

  mtcars %>%
    rownames_to_column("vehicle") %>%
    select(vehicle, !!col1_tidy, !!col2_tidy) %>%
    mutate(correlation=cor(eval(parse(text=col1)), eval(parse(text=col2))))
}

myCor("mpg", "disp")
#                vehicle  mpg  disp correlation
# 1            Mazda RX4 21.0 160.0  -0.8475514
# 2        Mazda RX4 Wag 21.0 160.0  -0.8475514
# 3           Datsun 710 22.8 108.0  -0.8475514
# 4       Hornet 4 Drive 21.4 258.0  -0.8475514
# ...

Is there a way to use tidy evaluation throughout this example?

Upvotes: 2

Views: 77

Answers (2)

kath

Reputation: 7734

From the vignette Programming with dplyr:

Most dplyr arguments are not referentially transparent. That means you can’t replace a value with a seemingly equivalent object that you’ve defined elsewhere.

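Your function receives strings, so enquo() captures the string "mpg" rather than the column name mpg. select() accepts strings, but inside mutate() the call becomes cor("mpg", "disp"), which fails with 'x' must be numeric. Therefore you need to pass unquoted column names to the function: enquo() then captures the bare name together with its environment, and !! splices it in as the column reference that mutate() expects.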
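To see the difference directly, the following minimal sketch inspects what enquo() captures for a bare name and for a string. The helper name captured is hypothetical and exists only for this illustration; quo_get_expr() is the rlang function that extracts the expression from a quosure.

```r
library(rlang)

# Hypothetical helper (illustration only): return the expression enquo() captured.
captured <- function(x) quo_get_expr(enquo(x))

# A bare name is captured as a symbol, which mutate() resolves to a column.
is.symbol(captured(mpg))

# A string is captured as a character constant, so cor() would receive "mpg" itself.
is.character(captured("mpg"))
```

Both calls should return TRUE, which is why the string version ends up calling cor() on character values.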

To turn an ordinary mutate() call into a function that uses non-standard evaluation, it can be more intuitive to work in steps.
First, write the call as you would without a function:

mtcars %>%
    rownames_to_column("vehicle") %>%
    select(vehicle, mpg, disp) %>%
    mutate(correlation = cor(mpg, disp))

This works (and cor() would raise an error if mpg and disp were quoted!).
Now pull the variables you'd like to vary out in front of the call and substitute them:

col1 <- quo(mpg)
col2 <- quo(disp)

mtcars %>%
  rownames_to_column("vehicle") %>%
  select(vehicle, !!col1, !!col2) %>%
  mutate(correlation=cor(!!col1, !!col2))

Because this is outside of a function, we have to use quo() here. In the last step, when we wrap the code in a function, we use enquo() instead.

myCor <- function(var1, var2) {
  col1 <- enquo(var1)
  col2 <- enquo(var2)

  mtcars %>%
    rownames_to_column("vehicle") %>%
    select(vehicle, !!col1, !!col2) %>%
    mutate(correlation=cor(!!col1, !!col2))
}

I used different names for the function arguments (var1, var2) and for the quoted objects created with enquo() (col1, col2) to make the distinction clearer, but of course it also works if both are called col1 and col2.
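As a side note, newer releases offer a shorthand for the enquo() plus !! pattern. The sketch below assumes rlang 0.4.0 or later (and a dplyr version built against it), which is newer than the dplyr 0.7.0 mentioned in the question; the function name myCor2 is hypothetical. The embrace operator {{ }} quotes and unquotes the argument in one step.

```r
library(dplyr)
library(tibble)

# Sketch only: assumes rlang >= 0.4.0, where {{ var }} stands in for !!enquo(var).
# myCor2 is a hypothetical name chosen for this example.
myCor2 <- function(var1, var2) {
  mtcars %>%
    rownames_to_column("vehicle") %>%
    select(vehicle, {{ var1 }}, {{ var2 }}) %>%
    mutate(correlation = cor({{ var1 }}, {{ var2 }}))
}

# Called with bare column names, as with the enquo() version.
res <- myCor2(mpg, disp)
head(res)
```

As with enquo(), this version expects bare column names rather than strings.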

Upvotes: 1

CJ Yetman

Reputation: 8848

Use ensym() instead of enquo() if you want to pass strings as your arguments. ensym() accepts either a string or a bare name and returns a symbol in both cases:

library(dplyr)
library(rlang)
library(tibble)

myCor <- function(col1, col2) {
  col1 <- ensym(col1)
  col2 <- ensym(col2)

  mtcars %>%
    rownames_to_column("vehicle") %>%
    select(vehicle, !!col1, !!col2) %>%
    mutate(correlation=cor(!!col1, !!col2))
}

# both of these will work now
myCor("mpg", "disp")
myCor(mpg, disp)
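The reason both calls behave the same can be checked with a small sketch. The helper name as_symbol is hypothetical and used only for this illustration; it simply returns what ensym() captured.

```r
library(rlang)

# Hypothetical helper (illustration only): return whatever ensym() captured.
as_symbol <- function(x) ensym(x)

# A string and a bare name both come back as the same symbol.
same <- identical(as_symbol("mpg"), as_symbol(mpg))
same

# The result is a symbol, so !! inserts a column reference, not a string.
is.symbol(as_symbol("mpg"))
```

Because the string is converted to a symbol before !! splices it in, cor() sees the column rather than the character value "mpg".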

Upvotes: 1
