Andrew Hill

Reputation: 317

Using Rlang: find the data pronoun in a set of quosures

I have a set of quosures which are being used to generate sets of summary statistics using dplyr.

I want to know which data columns are being used.

Every column is referenced through the .data pronoun, in the form .data[["ColumnName"]].

So for example we have:

my_quos <- rlang::list2(
  "GenderD" = rlang::quo(length(.data[["TeamCode"]])),
  "GenderMaleN" = rlang::quo(.data[["S1IsMale"]])
)

I've started tackling this problem by using rlang::call_args() to break a command up into its components:

my_args_test <- rlang::call_args(my_quos[[1]])
str(my_args_test)
List of 1
 $ : language .data[["TeamCode"]]

Every column should be wrapped in the data pronoun. Is there a quick way to check whether an item in the list is a data pronoun? I tried:

is(my_args_test[[1]], "rlang_data_pronoun")

But this returns FALSE. Checking whether the deparsed text begins with .data[[ might be an option, but I suspect that is more fragile.

Also, is there a clean way to return the argument passed to the data pronoun without parsing a string? In other words, the goal is to get this output:

c("TeamCode", "S1IsMale")

From the original my_quos.

Upvotes: 1

Views: 228

Answers (1)

Artem Sokolov

Reputation: 13691

This can be done in two steps. First, you want to extract expressions captured by your quosures and convert them to Abstract Syntax Trees (ASTs).

## Recursively constructs Abstract Syntax Tree for a given expression
## (the %>% pipe used throughout comes from magrittr)
library(magrittr)
getAST <- function( ee ) { as.list(ee) %>% purrr::map_if(is.call, getAST) }

## Apply function to expressions captured by each quosure
asts <- purrr::map( my_quos, rlang::quo_get_expr ) %>% purrr::map( getAST )
str(asts)
# List of 2
#  $ GenderD    :List of 2
#   ..$ : symbol length
#   ..$ :List of 3
#   .. ..$ : symbol [[
#   .. ..$ : symbol .data
#   .. ..$ : chr "TeamCode"
#  $ GenderMaleN:List of 3
#   ..$ : symbol [[
#   ..$ : symbol .data
#   ..$ : chr "S1IsMale"
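To see what getAST builds on, here is a minimal base R sketch of how as.list() splits a single call into its function and arguments. The expression is written with quote() rather than taken from a quosure, which is an assumption made only to keep the snippet free of package dependencies.

```r
## A call is a function symbol followed by its arguments
e <- quote(length(.data[["TeamCode"]]))
parts <- as.list(e)

length(parts)            # 2: the function and its single argument
parts[[1]]               # the symbol `length`
is.call(parts[[2]])      # TRUE: the argument is itself a call to `[[`

## Splitting the inner call exposes the column name as a plain string
inner <- as.list(parts[[2]])
inner[[3]]               # "TeamCode"
```

getAST simply repeats this split for every nested call it finds.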

From here, we see that an expression of the form .data[["somename"]] shows up as a length-3 list where the first entry is [[, the second entry is .data and the last entry is the name you're trying to extract. Let's write a function that recognizes this pattern and returns the third element when it matches (NOTE: this function also shows how to test an item against the .data pronoun, which was your other question):

## If the input matches .data[["name"]], returns "name". Otherwise, NULL
getName <- function( x )
{
  if( is.list(x) && length(x) == 3 &&          ## It's a length-3 list
      identical( x[[1]], quote(`[[`) ) &&      ##  with [[ as the first element
      identical( x[[2]], quote(.data) ) &&     ##  .data as the second element
      is.character(x[[3]]) ) return(x[[3]])    ##  and a character string as 3rd
  NULL
}
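As for why the is() check in the question returns FALSE: the element returned by call_args() is an unevaluated call to [[, not a pronoun object, so there is no pronoun class to match. A rough sketch of the same test applied directly to the call, without converting it to a list first, could look like this. The name is_data_subset is hypothetical, and only base R is used.

```r
## Hypothetical predicate: TRUE if x is a call of the form .data[[<something>]]
is_data_subset <- function(x) {
  is.call(x) && length(x) == 3 &&
    identical(x[[1]], quote(`[[`)) &&   # the function being called is [[
    identical(x[[2]], quote(.data))     # and it is applied to the .data symbol
}

arg <- quote(.data[["TeamCode"]])
class(arg)                        # "call"
is_data_subset(arg)               # TRUE
arg[[3]]                          # "TeamCode"
is_data_subset(quote(mean(x)))    # FALSE
```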

Given this function, the second step is simply to apply it recursively to your list of ASTs to extract column names used.

getNames <- function( aa ) { 
  purrr::keep(aa, is.list) %>% 
  purrr::map(getNames) %>%            ## Recurse to any list descendants
  c( getName(aa) ) %>%                ## Append self to the result
  unlist                              ## Return as character vector, not list
}

getNames(asts)
#     GenderD GenderMaleN 
#  "TeamCode"  "S1IsMale" 
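If you would rather skip the intermediate list-based ASTs, the two steps can be folded into a single recursive walk over the expression itself. This is only a sketch under a few assumptions: find_data_cols is a hypothetical name, the expressions are created with quote() (for quosures you would first extract them with rlang::quo_get_expr()), only the .data[["name"]] form is recognised (not .data$name), and calls with empty arguments such as x[, 1] are not handled.

```r
## Hypothetical helper: collect names used as .data[["name"]] in an expression
find_data_cols <- function(expr) {
  if (!is.call(expr)) return(character(0))      # symbols and constants: nothing to find
  if (length(expr) == 3 &&
      identical(expr[[1]], quote(`[[`)) &&
      identical(expr[[2]], quote(.data)) &&
      is.character(expr[[3]])) {
    return(expr[[3]])                           # matched .data[["name"]]
  }
  ## Otherwise recurse into the function position and every argument
  unlist(lapply(as.list(expr), find_data_cols), use.names = FALSE)
}

exprs <- list(
  GenderD     = quote(length(.data[["TeamCode"]])),
  GenderMaleN = quote(.data[["S1IsMale"]])
)
cols <- unlist(lapply(exprs, find_data_cols), use.names = FALSE)
cols
# [1] "TeamCode" "S1IsMale"
```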

Upvotes: 1
