zesla

Reputation: 11793

Convert group variable to group name after dplyr::group_by in R

I want to split the data into separate groups and look at each one.

mtcars %>% group_by(cyl) %>% select(mpg, hp)

The output:

# A tibble: 32 x 3
# Groups:   cyl [3]
     cyl   mpg    hp
 * <dbl> <dbl> <dbl>
 1     6  21.0   110
 2     6  21.0   110
 3     4  22.8    93
 4     6  21.4   110
 5     8  18.7   175
 6     6  18.1   105
 7     8  14.3   245
 8     4  24.4    62
 9     4  22.8    95
10     6  19.2   123
# ... with 22 more rows

The grouping variable is still a column in the dataframe. Is there any way to make the group variable a sort of group name (one entry per group), using dplyr? Something like below. It's much easier to visualize each group this way.

      4  21.0   110
         22.8    93
         21.4   110
      6  18.7   175
         18.1   105
      8  14.3   245
      ........

Upvotes: 2

Views: 2040

Answers (3)

Brian Syzdek

Reputation: 948

library(dplyr)
mtcars_df <- as.data.frame(mtcars %>% group_by(cyl) %>% select(mpg, hp) %>% arrange(cyl) )
# Fill `group` on the first row of each cyl value only; every other row stays NA
mtcars_df$group[!duplicated(mtcars_df$cyl)] <- unique(mtcars_df$cyl)

You can then replace the NAs with "" if you like (this turns the column into character).
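A minimal sketch of that last step, assuming you also want to drop the original cyl column and show the label first (the column reordering is my addition, not something the question requires):

```r
library(dplyr)

mtcars_df <- as.data.frame(mtcars %>% group_by(cyl) %>% select(mpg, hp) %>% arrange(cyl))
mtcars_df$group[!duplicated(mtcars_df$cyl)] <- unique(mtcars_df$cyl)

# Blank out the NAs; mixing numbers with "" makes the column character
mtcars_df$group <- ifelse(is.na(mtcars_df$group), "", as.character(mtcars_df$group))

# Keep only the label and the value columns, label first
mtcars_df <- mtcars_df[c("group", "mpg", "hp")]
print(mtcars_df, row.names = FALSE)
```

The blanks are for display only; keep the original cyl column around if you need to compute on the groups afterwards.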

Upvotes: 0

Frank

Reputation: 66819

Is there any way to make the group variable a sort of group name (one entry per group), using dplyr? [...] It's much easier to visualize each group this way.

You can create a custom function for printing and call it with magrittr's %T>% for browsing.

Here's an example using data.table/dtplyr, since data.table's split method has the convenient keep.by argument:

library(magrittr)
library(data.table)
library(dtplyr)
library(dplyr)

# print each group as its own table, without the grouping columns
print_by0 = function(x){
    gvars = as.character(groups(x))
    if (length(gvars)) x %>% split(by = gvars, keep.by = FALSE) %>% print
    else print(x)
}

# or ... better?
print_by = function(x){
    gvars = as.character(groups(x))   # grouping columns
    ovars = setdiff(names(x), gvars)  # all other columns
    # .g holds the group label on the first row of each group and "" elsewhere
    y = copy(x)[, .g := replace(rep("", .N), 1L, paste(.BY, collapse = "; ")), keyby=gvars]
    y[, (gvars) := NULL ]
    setcolorder(y, c(".g", ovars))
    setnames(y, ".g", sprintf("GRP: {%s}", paste(gvars, collapse = "; ")))
    print(data.table(y), nrows=Inf)   # nrows = Inf prints every row
}

Usage. Notice that res still has its original grouped structure: %T>% (magrittr's tee pipe) calls print_by for its side effect and then passes the left-hand side through unaltered.

> DT = data.table(mtcars)
> res = DT %>% group_by(am, cyl) %>% select(mpg, hp) %T>% print_by
    GRP: {am; cyl}  mpg  hp
 1:           0; 4 24.4  62
 2:                22.8  95
 3:                21.5  97
 4:           0; 6 21.4 110
 5:                18.1 105
 6:                19.2 123
 7:                17.8 123
 8:           0; 8 18.7 175
 9:                14.3 245
10:                16.4 180
11:                17.3 180
12:                15.2 180
13:                10.4 205
14:                10.4 215
15:                14.7 230
16:                15.5 150
17:                15.2 150
18:                13.3 245
19:                19.2 175
20:           1; 4 22.8  93
21:                32.4  66
22:                30.4  52
23:                33.9  65
24:                27.3  66
25:                26.0  91
26:                30.4 113
27:                21.4 109
28:           1; 6 21.0 110
29:                21.0 110
30:                19.7 175
31:           1; 8 15.8 264
32:                15.0 335
    GRP: {am; cyl}  mpg  hp
> res
Source: local data table [32 x 4]
Groups: am, cyl

# A tibble: 32 x 4
      am   cyl   mpg    hp
   <dbl> <dbl> <dbl> <dbl>
 1     1     6  21.0   110
 2     1     6  21.0   110
 3     1     4  22.8    93
 4     0     6  21.4   110
 5     0     8  18.7   175
 6     0     6  18.1   105
 7     0     8  14.3   245
 8     0     4  24.4    62
 9     0     4  22.8    95
10     0     6  19.2   123
# ... with 22 more rows

A dplyr guru could translate print_by to work with tibbles, but ... I'll leave that as an exercise.
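For what it's worth, here is one rough way that translation might look. This is only a sketch, not a tested equivalent of print_by: format_by and print_by_tbl are names I made up, the reshaping is done in base R after ungrouping, and it assumes a dplyr version that provides group_vars().

```r
library(dplyr)

# Hypothetical helper: returns a plain data frame in which the grouping
# columns are collapsed into one label column, filled on the first row of
# each group and "" elsewhere.
format_by <- function(x) {
  gvars <- group_vars(x)
  y <- as.data.frame(ungroup(x))
  if (length(gvars) == 0) return(y)
  keys <- unname(as.list(y[gvars]))
  y <- y[do.call(order, keys), , drop = FALSE]          # sort rows by group
  key <- do.call(paste, c(unname(as.list(y[gvars])), sep = "; "))
  label <- sprintf("GRP: {%s}", paste(gvars, collapse = "; "))
  ovars <- setdiff(names(y), gvars)
  y[[label]] <- ifelse(duplicated(key), "", key)        # label first row only
  y[c(label, ovars)]
}

# Hypothetical printer: prints the formatted view and returns the input
# invisibly, so it can sit in a pipe without changing the data.
print_by_tbl <- function(x) {
  print(format_by(x), row.names = FALSE)
  invisible(x)
}

res <- mtcars %>% group_by(am, cyl) %>% select(mpg, hp) %>% print_by_tbl()
```

Because print_by_tbl returns its input invisibly, a plain %>% is enough here and res is still the grouped tibble; %T>% would work as well.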

Upvotes: 1

akrun

Reputation: 887118

We can use replace to blank out the group value on every row except the first one in each group:

library(dplyr)
mtcars %>%
   group_by(cyl) %>%
   select(mpg, hp) %>% 
   arrange(cyl) %>%
   mutate(cyl1 = replace(cyl, row_number()>1, "")) %>%
   ungroup() %>%
   select(-cyl) %>%
   rename(cyl=cyl1)
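Note that replacing numeric values with "" coerces the column to character, and the renamed column ends up last. A possible variant, sketched here on the assumption that you want cyl as the first column, skips the grouping and blanks the repeats with duplicated() after sorting (out is just an illustrative name):

```r
library(dplyr)

out <- mtcars %>%
   select(cyl, mpg, hp) %>%
   arrange(cyl) %>%
   # keep the value on the first row of each cyl run, "" on the repeats
   mutate(cyl = replace(as.character(cyl), duplicated(cyl), ""))

out
```

This relies on the rows being sorted by cyl first; without arrange() the labels would land on the first occurrence of each value rather than at the top of a contiguous block.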

Upvotes: 1
