filcfig

Reputation: 75

Tidyverse, Rlang and tidyeval: Bang bang (!!) failing inside function, but it appears to work without quotation

I am running a function on a long database (full_database) with two major groups where I need to perform various linear models on multiple subsets, per group.

Then, I extract the R^2, the adjusted R^2 and the p-value into a data frame where each row corresponds to a single comparison. Since there are 30 different cases, I have another tibble (possibilities) that lists every combination of arguments for the function.

The script for the original function is:

database_correlation <-  function(id, group) {

    require(dplyr)
    require(tidyr)
    require(rlang)

    id_name <- quo_name(id)
    id_var <- enquo(id)
    group_name <- quo_name(group)
    group_var <- enquo(group)

    corr_db <- full_database %>%
      filter(numid==!!id_name) %>%
      filter(major_group==!!group_name) %>%
      droplevels()

    correlation <- summary(lm(yvar~xvar, corr_db))

    id.x <- as.character(!!id_var) #Gives out an error: "invalid argument type"
    group.x <- as.character(!!group_var) #Gives out an error: "invalid argument type"
    r_squared <- correlation$r.squared
    r_squared_adj <- correlation$adj.r.squared
    p_value <- correlation$coefficients[2,4]

    data.frame(id.x, group.x, r_squared, r_squared_adj, p_value, stringsAsFactors=FALSE)
  }

I then run the function with:

correlation_all <- lapply(seq(nrow(possibilities)), function(index) {
    current <- possibilities[index,]
    with(current, database_correlation(id, database))
  }) %>%
    bind_rows()

I have commented the part where I get an error (id.x and group.x assignment) and I've tried multiple alternatives (I will use id.x as an example):

  1. id_var <- enquo(id) & id.x <- print(!!id_var)
  2. id_var <- sym(id) & id.x <- as.character(!!id_var)
  3. id_var <- sym(id) & id.x <- print(!!id_var)
  4. No id_var & id.x <- !!id_name
  5. No id_var & id.x <- id_name

The last option (number 5) works even though it has no unquotation, and the same is true if I remove the bang bang (!!) when filtering full_database by using filter(numid==id_name) directly. I just can't understand why. By testing with TRUE and FALSE, R might be interpreting bang bang as double negation and, since it's expecting a boolean, it throws out an error.

Thank you for your help!

Upvotes: 5

Views: 1098

Answers (1)

smingerson

Reputation: 1438

Use id and group directly. I'm presuming these are character strings which were passed in, so there is no need to capture them as quosures and then coerce the quosures back to strings. Additionally, !! can only be used inside functions which support tidy evaluation. A simple first check is: "is the function from a base R package?" as.character() is, so !! doesn't work there.
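As a rough sketch of what "use id and group directly" could look like, here is a trimmed version of the function. The toy full_database below is made up purely so the example runs; only the column names are taken from the question.

```r
library(dplyr)

# Hypothetical stand-in for full_database (values are invented).
set.seed(1)
full_database <- data.frame(
  numid       = rep(c("a", "b"), each = 10),
  major_group = rep(c("g1", "g2"), times = 10),
  xvar        = rnorm(20),
  yvar        = rnorm(20),
  stringsAsFactors = FALSE
)

database_correlation <- function(id, group) {
  # id and group are plain strings, so no enquo() or !! is needed.
  # filter() finds no columns with these names and falls back to the
  # function's own arguments.
  corr_db <- full_database %>%
    filter(numid == id, major_group == group)

  correlation <- summary(lm(yvar ~ xvar, data = corr_db))

  data.frame(
    id.x          = id,
    group.x       = group,
    r_squared     = correlation$r.squared,
    r_squared_adj = correlation$adj.r.squared,
    p_value       = correlation$coefficients[2, 4],
    stringsAsFactors = FALSE
  )
}

result <- database_correlation("a", "g1")
```

This relies on full_database having no columns called id or group. If it did, those columns would shadow the arguments inside filter().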

If you are determined to convert the quosure to a string, you can use rlang::as_name() to retrieve the corresponding symbol as a string. This is the recommended way of doing so.
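A minimal sketch of that conversion, using a hypothetical helper label_of() that is not part of the question's code:

```r
library(rlang)

# Hypothetical helper: capture the argument and return its name as a string.
label_of <- function(arg) {
  arg_quo <- enquo(arg)  # quosure wrapping the expression the caller typed
  as_name(arg_quo)       # unwrap the quoted symbol into a plain string
}

from_symbol <- label_of(numid)  # "numid"
```

Note that this returns the name of the expression that was supplied, not the value of any variable. If the caller passes a variable that holds a string, the result is the variable's name, which is probably not what the question's function needs.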

"By testing with TRUE and FALSE, R might be interpreting bang bang as double negation and, since it's expecting a boolean, it throws out an error."

Your supposition is correct.
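A small illustration in base R, outside of any tidy evaluation function, where !! is just ! applied twice:

```r
# Two logical negations cancel out.
double_negated <- !!TRUE  # TRUE

# A character value cannot be negated, so `!` signals an error
# (reported as "invalid argument type" in the question).
err <- tryCatch(
  !!"some_id",
  error = function(e) conditionMessage(e)
)
```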

"The last option (number 5) works even though it has no unquotation, and the same is true if I remove the bang bang (!!) when filtering full_database by using filter(numid==id_name)."

Tidy evaluation, at its heart, is about evaluating symbols in the correct environment, or at least that's my take. This filter() works because it looks for the symbol id_name, does not find it in the data (the first place it looks), then looks in the enclosing environment, finds it, and evaluates the statement.

Imagine if you had a column named id_name within the data. How would you differentiate between the data's id_name and the one in the enclosing environment? Well, if you wanted the data's value, you could use .data$id_name (another rlang construct). If you want the value outside the data instead, use !!. This tells functions which support tidy evaluation to evaluate the operand right away, outside the data, and insert the result into the expression. When the operand is a quosure, the quosure identifies which environment it was defined in, so the symbol is evaluated in that environment, ensuring no collision with a name in the data.
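To make the collision concrete, here is a small sketch with made-up data in which a column and a local variable are both called id_name:

```r
library(dplyr)

# Hypothetical data: the id_name column clashes with the variable below.
df <- data.frame(
  numid   = c("a", "b", "c"),
  id_name = c("a", "x", "c"),
  stringsAsFactors = FALSE
)
id_name <- "b"

# Bare name: the column wins, so this compares numid with the id_name column.
masked <- filter(df, numid == id_name)

# .data$ asks for the column explicitly (same result as above).
from_col <- filter(df, numid == .data$id_name)

# !! inserts the outer value "b" before filter() looks at the data.
from_env <- filter(df, numid == !!id_name)
```

Here masked and from_col keep the rows where the two columns agree, while from_env keeps only the row with numid equal to "b".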

Upvotes: 2
