xiahfyj

Reputation: 101

How to group by multiple values in a function with dplyr

How can I modify the code below

xxx<-function(df,groupbys){
  groupbys<-enquo(groupbys)
    df%>%group_by_(groupbys)%>%summarise(count=n())
  }

zzz<-xxx(iris,Species)

to have the option to pass either one column or several columns to group by? For example, group by both Species and Petal.Length with the iris dataset.

Upvotes: 0

Views: 416

Answers (3)

Ronak Shah

Reputation: 389235

Here are two approaches to the problem. If you want to pass column names as unquoted variables, you can use ... and forward it to count, which replaces group_by + summarise.

xxx<-function(df,...){
   df %>% count(...)
}

xxx(mtcars, cyl)

# A tibble: 3 x 2
#    cyl     n
#  <dbl> <int>
#1     4    11
#2     6     7
#3     8    14

xxx(mtcars, cyl, am)

# A tibble: 6 x 3
#    cyl    am     n
#  <dbl> <dbl> <int>
#1     4     0     3
#2     4     1     8
#3     6     0     4
#4     6     1     3
#5     8     0    12
#6     8     1     2

The second approach: if you want to pass column names as quoted variables (strings), you can use group_by_at, which accepts a character vector.

xxx<-function(df,groupbys){
   df %>% group_by_at(groupbys) %>% summarise(n = n())
}

xxx(mtcars, c("cyl", "am"))
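
As a hedged aside: in dplyr 1.0.0 and later, the scoped verbs such as group_by_at are, to my knowledge, superseded by across(). A minimal sketch of the same string-based approach under that assumption (count_by_names is a hypothetical name, not part of the answer above):

```r
library(dplyr)

# Hypothetical helper; assumes dplyr >= 1.0.0 (across(), all_of(), .groups).
# groupbys is a character vector of one or more column names.
count_by_names <- function(df, groupbys) {
  df %>%
    group_by(across(all_of(groupbys))) %>%   # select grouping columns by name
    summarise(n = n(), .groups = "drop")     # return an ungrouped result
}

count_by_names(mtcars, "cyl")
count_by_names(mtcars, c("cyl", "am"))
```

all_of() raises an error if any of the names is missing from the data, which is usually what you want inside a function.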

Upvotes: 0

Onyambu

Reputation: 79338

Here you just need to use the .dots argument of the group_by function. Just make sure that groupbys is a character vector, i.e.

xxx<-function(df,groupbys){
  df%>%group_by(.dots = groupbys)%>%summarise(count=n())
}


xxx(iris,"Species")
# A tibble: 3 x 2
  Species    count
  <fct>      <int>
1 setosa        50
2 versicolor    50
3 virginica     50

xxx(iris,c("Species","Petal.Length"))
# A tibble: 48 x 3
# Groups:   Species [3]
   Species    Petal.Length count
   <fct>             <dbl> <int>
 1 setosa              1       1
 2 setosa              1.1     1
 3 setosa              1.2     2
 4 setosa              1.3     7
 5 setosa              1.4    13
 6 setosa              1.5    13
 7 setosa              1.6     7
 8 setosa              1.7     4
 9 setosa              1.9     2
10 versicolor          3       1
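
A note of caution: as far as I know, the .dots argument of group_by is deprecated in dplyr 1.0.0 and later. A rough equivalent that still takes a character vector is to convert the strings to symbols and splice them in with !!!. This is only a sketch, assuming dplyr >= 1.0.0 and the rlang package (installed as a dplyr dependency); count_by_syms is a hypothetical name:

```r
library(dplyr)

# Hypothetical helper; groupbys is a character vector of column names.
count_by_syms <- function(df, groupbys) {
  grps <- rlang::syms(groupbys)              # strings -> list of symbols
  df %>%
    group_by(!!!grps) %>%                    # splice the symbols into group_by()
    summarise(count = n(), .groups = "drop")
}

count_by_syms(iris, "Species")
count_by_syms(iris, c("Species", "Petal.Length"))
```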

Upvotes: 0

r2evans

Reputation: 160827

When using enquo (single argument) or enquos (multiple), you should use the !! and !!! operators, respectively.

xxx <- function(df, ...) {
  grps <- enquos(...)
  df %>%
    group_by(!!!grps) %>%
    tally() %>%
    ungroup()
}
mtcars %>% xxx(cyl, am)
# # A tibble: 6 x 3
#     cyl    am     n
#   <dbl> <dbl> <int>
# 1     4     0     3
# 2     4     1     8
# 3     6     0     4
# 4     6     1     3
# 5     8     0    12
# 6     8     1     2
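
For the single-argument case, the enquo/!! pairing mentioned above could look like the following. It is a minimal sketch of the question's original function with group_by_ swapped for group_by(!!...); count_by_one is a hypothetical name:

```r
library(dplyr)

# Hypothetical single-column version of the function from the question.
count_by_one <- function(df, groupby) {
  grp <- enquo(groupby)        # capture the unquoted column name
  df %>%
    group_by(!!grp) %>%        # unquote it inside group_by()
    summarise(count = n())
}

count_by_one(iris, Species)
```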

Or, if you want to keep a single argument in the function formals for one or more column names, I think you'll need to use vars() in the call. (Perhaps there's another way suggested in the Programming with dplyr vignette.)

xxx <- function(df, groups) {
  df %>%
    group_by(!!!groups) %>%
    tally() %>%
    ungroup()
}
xxx(mtcars, vars(cyl, am))
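
One other way to keep a single argument, offered as a sketch rather than a definitive answer: with dplyr >= 1.0.0 and rlang >= 0.4.0 (both assumed here), the argument can be embraced with {{ }} inside across(), so the caller writes c(cyl, am) instead of vars(cyl, am). Newer dplyr versions also provide pick() for this purpose. count_by_cols is a hypothetical name:

```r
library(dplyr)

# Hypothetical helper; groups is one unquoted column or c(col1, col2, ...).
count_by_cols <- function(df, groups) {
  df %>%
    group_by(across({{ groups }})) %>%   # tidy-select the grouping columns
    tally() %>%
    ungroup()
}

count_by_cols(mtcars, cyl)
count_by_cols(mtcars, c(cyl, am))
```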

Upvotes: 2
