JD Long

Reputation: 60746

Programmatically dropping a `group_by` field in dplyr

I'm writing functions that take in a data.frame and then do some operations. I need to add and subtract items from the group_by criteria in order to get where I want to go.

If I want to add a grouping variable to a grouped df, that's pretty easy:

library(tidyverse)
set.seed(42)
n <- 10
input <- data.frame(a = 'a', 
                    b = 'b' , 
                    vals = 1
)

grouped <- input %>%
  group_by(a)

grouped
#> # A tibble: 1 x 3
#> # Groups:   a [1]
#>   a     b      vals
#>   <fct> <fct> <dbl>
#> 1 a     b        1.

## add a group (in dplyr >= 1.0.0 this argument is spelled `.add`):
grouped %>% 
  group_by(b, add=TRUE)
#> # A tibble: 1 x 3
#> # Groups:   a, b [1]
#>   a     b      vals
#>   <fct> <fct> <dbl>
#> 1 a     b        1.

## drop a group?

But how do I programmatically drop the grouping by b which I added, yet keep all other groupings the same?

Upvotes: 11

Views: 1699

Answers (3)

eipi10

Reputation: 93811

Here's an approach that uses tidyeval so that bare column names can be used as the function arguments. I'm not sure if it makes sense to convert the bare column names to text (as I've done below) or if there's a more elegant way to work directly with the bare column names.

drop_groups = function(data, ...) {

  groups = map_chr(groups(data), rlang::quo_text)
  drop = map_chr(quos(...), rlang::quo_text)

  if(any(!drop %in% groups)) {
    warning(paste("Input data frame is not grouped by the following groups:", 
                  paste(drop[!drop %in% groups], collapse=", ")))
  }

  data %>% group_by_at(setdiff(groups, drop))

}

d = mtcars %>% group_by(cyl, vs, am)

groups(d %>% drop_groups(vs, cyl))
#> [[1]]
#> am

groups(d %>% drop_groups(a, vs, b, c))
#> [[1]]
#> cyl
#> 
#> [[2]]
#> am
#> 
#> Warning message:
#> In drop_groups(., a, vs, b, c) :
#>   Input data frame is not grouped by the following groups: a, b, c

UPDATE: The approach below works directly with quosured column names, without converting them to strings. I'm not sure which approach is "preferred" in the tidyeval paradigm, or whether there is yet another, more desirable method.

drop_groups2 = function(data, ...) {

  groups = map(groups(data), quo)
  drop = quos(...)

  if(any(!drop %in% groups)) {
    warning(paste("Input data frame is not grouped by the following groups:", 
                  paste(drop[!drop %in% groups], collapse=", ")))
  }

  data %>% group_by(!!!setdiff(groups, drop))

}
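
As one more possible variant (a sketch, not a claim about what is preferred in tidyeval): the bare names can be captured as symbols with `rlang::ensyms()` and turned into strings with `rlang::as_string()`, after which the comparison against `group_vars()` is plain character matching. The name `drop_groups_sym` is made up for illustration, and the regrouping step assumes dplyr >= 1.0.0 for `across()` and `all_of()`.

```r
library(dplyr)

# Hypothetical helper: drop the grouping variables given as bare names.
drop_groups_sym <- function(data, ...) {
  current <- group_vars(data)
  # ensyms() captures each argument in ... as a symbol; it errors on
  # anything that is not a bare name or a string
  drop <- vapply(rlang::ensyms(...), rlang::as_string, character(1),
                 USE.NAMES = FALSE)

  missing <- setdiff(drop, current)
  if (length(missing) > 0) {
    warning("Input data frame is not grouped by the following groups: ",
            paste(missing, collapse = ", "))
  }

  keep <- setdiff(current, drop)
  # ungroup first so the new grouping replaces the old one entirely
  data %>% ungroup() %>% group_by(across(all_of(keep)))
}

d <- mtcars %>% group_by(cyl, vs, am)
group_vars(drop_groups_sym(d, vs, cyl))
```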

Upvotes: 11

IceCreamToucan

Reputation: 28685

A function to remove groups by column name:

drop_groups_at <- function(df, vars){
  df %>% 
    group_by_at(setdiff(group_vars(.), vars))
}


input %>%
  group_by(a, b) %>% 
  drop_groups_at('b') %>% 
  group_vars

# [1] "a"
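
In dplyr 1.0.0 and later, `group_by_at()` is superseded, so the same idea can be sketched with `across()` and `all_of()`. This assumes dplyr >= 1.0.0, and the name `drop_groups_across` is invented here for illustration:

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): regroup by every current
# grouping variable except those named in the character vector `vars`.
drop_groups_across <- function(df, vars) {
  keep <- setdiff(group_vars(df), vars)
  # ungroup first so the new grouping replaces the old one entirely
  df %>% ungroup() %>% group_by(across(all_of(keep)))
}

input <- data.frame(a = "a", b = "b", vals = 1)

input %>%
  group_by(a, b) %>%
  drop_groups_across("b") %>%
  group_vars()

# [1] "a"
```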

Upvotes: 8

joran

Reputation: 173577

Maybe something like this, which removes grouping variables by position, working back from the end of the list:

grouped %>% 
 group_by(b, add=TRUE) -> grouped
grouped %>% group_by_at(.vars = group_vars(.)[-2])

Alternatively, apply `head()` or `tail()` to the output of `group_vars()` for more control.

It would be interesting to have this sort of utility function available more generally:

peel_groups <- function(.data,n){
  .data %>%
    group_by_at(.vars = head(group_vars(.data),-n))
}

A more thought-out version would likely check that n is within bounds.
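
A rough sketch of what those checks might look like follows. The name `peel_groups_checked` and the choice to raise an error when `n` is too large are assumptions for illustration, and the regrouping step assumes dplyr >= 1.0.0. One detail to watch: `head(x, -0)` is the same as `head(x, 0)` and returns nothing, so `n = 0` is handled here by computing the number of groups to keep explicitly.

```r
library(dplyr)

# Hypothetical variant of peel_groups() with bounds checks on n.
peel_groups_checked <- function(.data, n = 1) {
  vars <- group_vars(.data)

  # n must be a single non-negative whole number
  stopifnot(is.numeric(n), length(n) == 1, n >= 0, n == round(n))
  if (n > length(vars)) {
    stop("Cannot peel ", n, " group(s): data has only ",
         length(vars), ".", call. = FALSE)
  }

  # keep the first (length - n) grouping variables; safe for n = 0
  keep <- head(vars, length(vars) - n)
  if (length(keep) == 0) {
    return(ungroup(.data))
  }
  .data %>% ungroup() %>% group_by(across(all_of(keep)))
}

d <- mtcars %>% group_by(cyl, vs, am)
group_vars(peel_groups_checked(d, 1))
```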

Upvotes: 10
