Marcus Lehr

Reputation: 57

How to use dplyr::group_by with multiple groups when programming

Okay so it's one of those days where a previously working piece of code suddenly breaks. Here's a reprex of the code in question:

library(dplyr)

test = data.frame(factor1 = sample(1:5, 10, replace=TRUE),
                  factor2 = sample(letters[1:5], 10, replace=TRUE),
                  variable = sample(100:200, 10))

group_vars = c('factor1','factor2') %>% paste(., collapse = ',')

> test %>% dplyr::group_by_(group_vars)
Error in parse(text = x) : <text>:1:8: unexpected ','
1: factor1,
           ^

Now I sweaaaar this worked until today. Of course dplyr is trying to do away with the 'x_' functions anyway, but I've tried plugging everything I can think of into group_by(), using combinations of !!, !!!, sym(), quo(), enquo(), etc., and can't figure it out. I've also tried not pasting the column names together, and AT BEST it simply takes the first one and ignores everything else. Most commonly I get the following error message:

Error: Column <chr> must be length 10 (the number of rows) or one, not 2

I've also read over Hadley's dplyr programming guide (https://dplyr.tidyverse.org/articles/programming.html), WHICH SEEMS to cover the issue, except that I'm generating the column names internally and not accepting them as arguments to the function. Has anyone come across this, or does anyone understand quoting well enough to know a solution?

Also, to be clear, this works when only using a single grouping variable. The problem is with multiple groups.

Thanks!

Upvotes: 2

Views: 779

Answers (1)

akrun

Reputation: 886938

Instead of pasting the names together and using group_by_ (deprecated, and it fails here because it parses each string as a single expression, which 'factor1,factor2' is not), we can pass the character vector directly to group_by_at:

library(dplyr)
group_vars <- c('factor1','factor2')
test %>%
     group_by_at(group_vars)
# A tibble: 10 x 3
# Groups:   factor1, factor2 [10]
#   factor1 factor2 variable
#     <int> <fct>      <int>
# 1       1 d            145
# 2       5 e            119
# 3       4 a            181
# 4       3 e            155
# 5       3 d            164
# 6       3 b            135
# 7       4 e            137
# 8       4 d            197
# 9       2 d            142
#10       2 c            110

Or another option is to convert the names to symbols (syms from rlang) and unquote-splice them (!!!) within group_by:

test %>%
      group_by(!!! rlang::syms(group_vars))

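If we go the paste route, then one option is parse_exprs (from rlang), which parses a string containing several expressions separated by ';', so the names are collapsed with ';' instead of ',':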

group_vars = c('factor1','factor2') %>% paste(., collapse = ';')
test %>%
      group_by(!!! rlang::parse_exprs(group_vars))
# A tibble: 10 x 3
# Groups:   factor1, factor2 [10]
#   factor1 factor2 variable
#     <int> <fct>      <int>
# 1       1 d            145
# 2       5 e            119
# 3       4 a            181
# 4       3 e            155
# 5       3 d            164
# 6       3 b            135
# 7       4 e            137
# 8       4 d            197
# 9       2 d            142
#10       2 c            110
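
In more recent dplyr (1.0.0 and later, as far as I know), the scoped verbs such as group_by_at are superseded in favour of across(). A minimal sketch of the same grouping, assuming that version or newer is installed; the set.seed() call and the name grouped are only there for illustration and are not part of the original reprex:

```r
library(dplyr)

set.seed(42)  # assumption: seed added only so the sample data is reproducible
test <- data.frame(factor1 = sample(1:5, 10, replace = TRUE),
                   factor2 = sample(letters[1:5], 10, replace = TRUE),
                   variable = sample(100:200, 10))

# Column names generated internally as a plain character vector
group_vars <- c("factor1", "factor2")

# all_of() selects the columns named in the vector; across() hands them to group_by()
grouped <- test %>%
  group_by(across(all_of(group_vars)))

# List the grouping columns (dplyr:: prefix used because the vector above shares the name)
dplyr::group_vars(grouped)
```

all_of() also errors if one of the names is missing from the data, which can help catch typos in internally generated column names.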

Upvotes: 2
