complex column selection in dplyr group_by

Question

I would like to use, within a group_by call, dplyr's column selectors like starts_with(), ends_with(), matches(), ..., or even the syntax -colName.

(Silly) example of the syntax I am after:

library("dplyr")

# I would like to do something like this
mtcars %>% 
   group_by(matches("a")) %>%
   summarise(mpg=mean(mpg))
# but I get a "wrong result size" error

I was hoping it would work, by analogy with:

mtcars %>% select(matches("a"))

which here selects the columns drat, am, gear, and carb.

To be crystal clear: I want to use matches("a") (or equivalent) to achieve the same output as:

mtcars %>% 
   group_by(drat, am, gear, carb) %>%
   summarise(mpg=mean(mpg))

I am only interested in answers using dplyr. Thanks!

The current answer, while good, only allows selecting columns with a regex.

I am still looking for a more general answer that would allow the use of the full range of dplyr's selection syntax. Of course I can massage any regex to select what I want, but I would prefer something that integrates more nicely with dplyr (especially to use the -colName syntax). I am going to leave this open for a while.

asachet · Accepted Answer

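group_by_at was added to dplyr some time in 2017 and does just that. The selection helper needs to be wrapped in vars(), since helpers like matches() are only meaningful inside a selecting context:

mtcars %>% 
   # vars() captures the selection so group_by_at can resolve it against the data
   group_by_at(vars(matches("a"))) %>%
   summarise(mpg=mean(mpg))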
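
For readers on a newer dplyr: since version 1.0.0 the scoped `_at` verbs are superseded by across(), which accepts the full selection syntax directly inside group_by. The following is a minimal sketch, assuming dplyr >= 1.0.0. The object names by_match and by_exclusion are only illustrative, and the second pipeline shows the negative selection asked about in the question (the analogue of -colName).

```r
library(dplyr)

# Assumption: dplyr >= 1.0.0, where across() can be used inside group_by().

# Group by every column whose name matches "a" (drat, am, gear, carb).
by_match <- mtcars %>%
  group_by(across(matches("a"))) %>%
  summarise(mpg = mean(mpg), .groups = "drop")

# Negative selection: group by every column except mpg and cyl.
by_exclusion <- mtcars %>%
  group_by(across(-c(mpg, cyl))) %>%
  summarise(mpg = mean(mpg), .groups = "drop")
```

Any helper that works in select(), such as starts_with() or ends_with(), should work the same way inside across().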
