any willow brook

Reputation: 68

Evaluating ... when other function arguments are NULL by default

I would like to provide a user-facing function that lets arbitrary grouping variables be passed to a summary function, together with optional filtering arguments that are NULL by default (so the filter is skipped unless a value is supplied).

I understand why the third call in the example below fails: homeworld is unnamed, so it is matched positionally to .species instead of being collected by ..., and evaluating .species then looks for an object called homeworld. What I'm unsure of is the best way to pass the dots in this situation. Ideally, the second and third calls to fun below would return the same result.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
fun <- function(.df, .species = NULL, ...) {

  .group_vars <- rlang::ensyms(...)

  if (!is.null(.species)) {
    .df <- .df %>%
      dplyr::filter(.data[["species"]] %in% .species)  
  }

  .df %>%
    dplyr::group_by(!!!.group_vars) %>%
    dplyr::summarize(
      ht = mean(.data[["height"]], na.rm = TRUE),
      .groups = "drop"
    )

}

fun(.df = starwars, .species = c("Human", "Droid"), species, homeworld)
#> # A tibble: 19 x 3
#>    species homeworld       ht
#>    <chr>   <chr>        <dbl>
#>  1 Droid   Naboo          96 
#>  2 Droid   Tatooine      132 
#>  3 Droid   <NA>          148 
#>  4 Human   Alderaan      176.
#>  5 Human   Bespin        175 
#>  6 Human   Bestine IV    180 
#>  7 Human   Chandrila     150 
#>  8 Human   Concord Dawn  183 
#>  9 Human   Corellia      175 
#> 10 Human   Coruscant     168.
#> 11 Human   Eriadu        180 
#> 12 Human   Haruun Kal    188 
#> 13 Human   Kamino        183 
#> 14 Human   Naboo         168.
#> 15 Human   Serenno       193 
#> 16 Human   Socorro       177 
#> 17 Human   Stewjon       182 
#> 18 Human   Tatooine      179.
#> 19 Human   <NA>          193
fun(.df = starwars, .species = NULL, homeworld)
#> # A tibble: 49 x 2
#>    homeworld         ht
#>    <chr>          <dbl>
#>  1 Alderaan        176.
#>  2 Aleen Minor      79 
#>  3 Bespin          175 
#>  4 Bestine IV      180 
#>  5 Cato Neimoidia  191 
#>  6 Cerea           198 
#>  7 Champala        196 
#>  8 Chandrila       150 
#>  9 Concord Dawn    183 
#> 10 Corellia        175 
#> # … with 39 more rows
fun(.df = starwars, homeworld)
#> Error in fun(.df = starwars, homeworld): object 'homeworld' not found


<sup>Created on 2020-06-15 by the [reprex package](https://reprex.tidyverse.org) (v0.3.0)</sup>
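
The matching behaviour behind that error can be seen without dplyr. The following is a minimal sketch in base R (it needs R 3.5.0 or later for `...length()`); `dots_last` and `dots_first` are hypothetical names used only for illustration, and the two functions differ only in where `...` sits in the signature.

```r
# Hypothetical helpers: identical bodies, different position of `...`.
dots_last <- function(.df, .species = NULL, ...) {
  # missing() and ...length() inspect the call without evaluating arguments,
  # so `homeworld` does not need to exist as an object.
  list(species_supplied = !missing(.species), n_dots = ...length())
}

dots_first <- function(.df, ..., .species = NULL) {
  list(species_supplied = !missing(.species), n_dots = ...length())
}

# Unnamed `homeworld` is matched positionally to .species; the dots stay empty.
str(dots_last(.df = 1, homeworld))

# Arguments after `...` must be named in full, so `homeworld` lands in the dots.
str(dots_first(.df = 1, homeworld))
```

In the first call `.species` receives `homeworld`, which is why `is.null(.species)` in `fun` later tries to evaluate it and fails.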

I know that I can achieve the desired result by replacing ... with a .groups argument that takes the grouping variables as strings:

fun <- function(.df, .species = NULL, .groups = NULL) {

  .group_vars <- rlang::syms(purrr::map(.groups, rlang::as_string))

...

}

But I am looking for a solution using ..., or that allows the user to pass either strings or symbols to .groups, e.g. .groups = c(species, homeworld) or .groups = c("species", "homeworld").

Upvotes: 1

Views: 133

Answers (1)

Dason

Reputation: 61953

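You could reorder the parameters so that .species comes after the dots. Arguments that follow ... in a function signature can only be matched by their full name, so an unnamed homeworld is collected by ... instead of being matched to .species.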

fun <- function(.df, ..., .species = NULL) {

    .group_vars <- rlang::ensyms(...)

    if (!is.null(.species)) {
        .df <- .df %>%
            dplyr::filter(.data[["species"]] %in% .species)  
    }

    .df %>%
        dplyr::group_by(!!!.group_vars) %>%
        dplyr::summarize(
            ht = mean(.data[["height"]], na.rm = TRUE),
            .groups = "drop"
        )

}

fun(.df = starwars, homeworld)

which gives

> fun(.df = starwars, homeworld)
# A tibble: 49 x 3
   homeworld         ht .groups
   <chr>          <dbl> <chr>  
 1 NA              139. drop   
 2 Alderaan        176. drop   
 3 Aleen Minor      79  drop   
 4 Bespin          175  drop   
 5 Bestine IV      180  drop   
 6 Cato Neimoidia  191  drop   
 7 Cerea           198  drop   
 8 Champala        196  drop   
 9 Chandrila       150  drop   
10 Concord Dawn    183  drop   
# ... with 39 more rows

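which is the grouping you wanted. (The extra .groups column appears because this output was presumably produced with a dplyr version older than 1.0.0, where summarize() has no .groups argument and treats it as a new summary column; with dplyr 1.0.0 or later the result contains only homeworld and ht.) The other examples still work as well.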
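
For the alternative raised in the question, a single `.groups` argument that accepts either bare names or strings, one possible sketch is to capture the argument and hand it to tidyselect through `dplyr::across()`. This assumes dplyr 1.0.0 or later; `fun_sel` is a hypothetical name, and this is not the approach from the answer above. In dplyr 1.1.0 and later, `dplyr::pick()` is intended for this kind of selection and could likely be swapped in for `across()`.

```r
library(dplyr)

# Hypothetical variant of fun(): `.groups` is a tidyselect expression,
# e.g. c(species, homeworld) or c("species", "homeworld").
fun_sel <- function(.df, .groups = NULL, .species = NULL) {
  # Capture the expression without evaluating it.
  groups_quo <- rlang::enquo(.groups)

  if (!is.null(.species)) {
    .df <- dplyr::filter(.df, .data[["species"]] %in% .species)
  }

  # Group only when the caller supplied something other than NULL.
  if (!rlang::quo_is_null(groups_quo)) {
    .df <- dplyr::group_by(.df, dplyr::across(!!groups_quo))
  }

  dplyr::summarize(
    .df,
    ht = mean(.data[["height"]], na.rm = TRUE),
    .groups = "drop"
  )
}

by_name   <- fun_sel(starwars, .groups = c(species, homeworld))
by_string <- fun_sel(starwars, .groups = c("species", "homeworld"))
identical(by_name, by_string)
```

The trade-off is that grouping variables must now be wrapped in `c()` and passed by name, whereas the dots version lets them be listed directly in the call.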

Upvotes: 2
