Matifou

Reputation: 8890

R rlang: handle NULL arguments?

I want to pass an optional argument, with a default value of NULL, to a dplyr function (say count()). If I use the standard procedure with !!enquo(), I get the error: Column `NULL` is unknown.

Interestingly, rlang/tidyverse handles missing arguments, so one trick could be to convert NULL to missing, but that seems quite dirty (especially since I want to use facet_grid afterwards, which accepts NULL but not missing).

library(tidyverse)
df <- tibble(a = sample(LETTERS[1:2], 100, replace = TRUE), 
             b = sample(LETTERS[3:4], 100, replace = TRUE), 
             value = rnorm(100,5,1))

f2 <- function(df, group_var1=a,  group_var2=NULL, group_var3) {
  res <- df %>%
    count({{group_var1}}, {{group_var2}}, {{group_var3}})

  print(res)
  ggplot(aes(x=a, y=n), data = res)+
    geom_col() +
    facet_grid(row= enquo(group_var2))
}

f2(df, group_var1 = a, group_var2=b)
#> # A tibble: 4 x 3
#>   a     b         n
#>   <chr> <chr> <int>
#> 1 A     C        26
#> 2 A     D        29
#> 3 B     C        16
#> 4 B     D        29


f2(df, group_var1 = a)
#> Error: Column `NULL` is unknown

Created on 2019-08-04 by the reprex package (v0.3.0)

Upvotes: 2

Views: 593

Answers (1)

paqmo

Reputation: 3729

Neither group_by nor count accepts NULL as a grouping expression. So you first have to capture the arguments as a list of quosures with enquos and drop the ones that are NULL. Since count is just a wrapper around group_by and tally, we can group and count by hand using group_by_at, the scoped variant of group_by.

f2 <- function(df, group_var1=a,  group_var2=NULL, group_var3) {

  grps <- enquos(a = group_var1, b = group_var2, c = group_var3, .ignore_empty = "all")

  # this removes the NULL values

  grps <- grps[map_lgl(grps, ~ !quo_is_null(.))]

  res <- df %>%
    group_by_at(grps) %>% 
    tally() %>% 
    ungroup()

  print(res) 
}

This does a fine job creating the res data frame:

> f2(df, group_var1 = a, group_var2=b)
# A tibble: 4 x 3
  a     b         n
  <chr> <chr> <int>
1 A     C        20
2 A     D        30
3 B     C        22
4 B     D        28
> f2(df, group_var1 = a)
# A tibble: 2 x 2
  a         n
  <chr> <int>
1 A        50
2 B        50
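
To see the filtering step in isolation, here is a minimal sketch that uses only rlang. The helper name capture_groups is hypothetical (it is not part of the answer's f2), and base vapply() stands in for purrr's map_lgl() just to keep the example dependency-light:

```r
library(rlang)

# Hypothetical helper, for illustration only: capture three arguments
# and keep the ones that were supplied and are not NULL.
capture_groups <- function(g1, g2 = NULL, g3) {
  # .ignore_empty = "all" drops arguments that were not supplied (here g3)
  grps <- enquos(a = g1, b = g2, c = g3, .ignore_empty = "all")
  # quo_is_null() is TRUE for a quosure whose expression is NULL
  grps[!vapply(grps, quo_is_null, logical(1))]
}

# The column names are never evaluated, so x and y need not exist
names(capture_groups(x))
names(capture_groups(x, y))
```

The first call should keep only the quosure named a, the second the ones named a and b, which matches the one-group and two-group results above.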

However, we run into problems again when creating the plot. enquo captures the argument as a quosure, so NULL becomes the quoted expression `NULL` rather than the value NULL, and ggplot does not know how to handle it. So I think a conditional statement is the way to go:

f2 <- function(df, group_var1=a,  group_var2=NULL, group_var3) {

  grps <- enquos(a = group_var1, b = group_var2, c = group_var3, .ignore_empty = "all")

  grps <- grps[map_lgl(grps, ~ !quo_is_null(.))]

  res <- df %>%
    group_by_at(grps) %>% 
    tally() %>% 
    ungroup()

  print(res)

  if (quo_is_null(enquo(group_var2))) {
    ggplot(aes(x=a, y=n), data = res)+
      geom_col()
  } else {
    ggplot(aes(x=a, y=n), data = res)+
      geom_col() +
      facet_grid(rows = enquo(group_var2))
  }

}
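
The conditional can also be written without repeating the plot code: build the base plot once and add the facet only when a variable was given. This is a sketch, not the answer's original code. plot_counts is a hypothetical helper, res is assumed to have columns a and n as above, and the facet is passed through vars(), the documented way to hand expressions to facet_grid:

```r
library(ggplot2)
library(rlang)

# Hypothetical helper: `res` is assumed to contain columns `a` and `n`
plot_counts <- function(res, facet_var = NULL) {
  facet_quo <- enquo(facet_var)

  # Base plot, shared by both cases
  p <- ggplot(res, aes(x = a, y = n)) +
    geom_col()

  # Add the facet only when a faceting variable was actually supplied
  if (!quo_is_null(facet_quo)) {
    p <- p + facet_grid(rows = vars(!!facet_quo))
  }
  p
}

# Small hand-made counts table, for illustration only
res <- data.frame(a = c("A", "A", "B", "B"),
                  b = c("C", "D", "C", "D"),
                  n = c(20, 30, 22, 28))

plot_counts(res)     # no facets
plot_counts(res, b)  # one row of panels per value of b
```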

Update based on Matifou's comments:

library(tidyverse)
library(rlang)
df <- tibble(a = sample(LETTERS[1:2], 100, replace = TRUE), 
             b = sample(LETTERS[3:4], 100, replace = TRUE), 
             value = rnorm(100,5,1))

f2 <- function(df, group_var1=a,  group_var2=NULL, group_var3) {

  grps <- enquos(a = group_var1, b = group_var2, c = group_var3, .ignore_empty = "all")
  grps <- grps[map_lgl(grps, ~ !quo_is_null(.))]

  res <- df %>%
    count(!!!grps) 

  print(res)

  ggplot(aes(x=a, y=n), data = res)+
    geom_col() +
    facet_grid(row= enquos(group_var2))
}

f2(df, group_var1 = a, group_var2=b)
#> # A tibble: 4 x 3
#>   a     b         n
#>   <chr> <chr> <int>
#> 1 A     C        29
#> 2 A     D        33
#> 3 B     C        18
#> 4 B     D        20

f2(df, group_var1 = a)
#> # A tibble: 2 x 2
#>   a         n
#>   <chr> <int>
#> 1 A        62
#> 2 B        38
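
One thing to keep in mind with this version: because the quosures are captured with explicit names (a = group_var1, b = group_var2), count(!!!grps) names the result columns a and b whatever columns were passed in. That suits the hard-coded aes(x = a), but the facet refers to the original column name, so the two only agree when group_var2 really is b. The sketch below shows the naming behaviour on its own; fixed_name and auto_name are hypothetical helpers, not part of the answer:

```r
library(rlang)

# Hypothetical helpers, for illustration only
fixed_name <- function(g) names(enquos(a = g))             # name given explicitly
auto_name  <- function(g) names(enquos(g, .named = TRUE))  # name taken from the expression

fixed_name(cyl)
auto_name(cyl)
```

With the explicit name the result is always "a", while .named = TRUE labels the quosure after the expression that was passed (here "cyl").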

Upvotes: 1
