Reputation: 11793
I want to split the data into separate groups and look at each one.
mtcars %>% group_by(cyl) %>% select(mpg, hp)
The output:
# A tibble: 32 x 3
# Groups: cyl [3]
cyl mpg hp
* <dbl> <dbl> <dbl>
1 6 21.0 110
2 6 21.0 110
3 4 22.8 93
4 6 21.4 110
5 8 18.7 175
6 6 18.1 105
7 8 14.3 245
8 4 24.4 62
9 4 22.8 95
10 6 19.2 123
# ... with 22 more rows
The grouping variable is still a column in the data frame. Is there any way to make the group variable a sort of group name (one entry per group), using dplyr? Something like the output below. It's much easier to visualize each group this way.
4 21.0 110
22.8 93
21.4 110
6 18.7 175
18.1 105
8 14.3 245
........
Upvotes: 2
Views: 2040
Reputation: 948
library(dplyr)
mtcars_df <- as.data.frame(mtcars %>% group_by(cyl) %>% select(mpg, hp) %>% arrange(cyl))
# Label only the first row of each group; all other rows of `group` stay NA
mtcars_df$group[!duplicated(mtcars_df$cyl)] <- unique(mtcars_df$cyl)
You can then replace the NAs in the new group column with "" if you prefer.
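A minimal sketch of that last step, rebuilding `mtcars_df` as above so the snippet runs on its own. The `group` column is numeric, so it is converted to character before the empty strings are written in. The final column reorder is my own assumption about the layout the question asks for, not part of the answer.

```r
library(dplyr)

# Rebuild the data frame from the answer: cyl, mpg, hp, sorted by group
mtcars_df <- as.data.frame(
  mtcars %>% group_by(cyl) %>% select(mpg, hp) %>% arrange(cyl)
)
mtcars_df$group[!duplicated(mtcars_df$cyl)] <- unique(mtcars_df$cyl)

# Convert to character so "" is a valid value, then blank out the NAs
mtcars_df$group <- as.character(mtcars_df$group)
mtcars_df$group[is.na(mtcars_df$group)] <- ""

# Assumption: drop the original cyl column and show the label first
mtcars_df <- mtcars_df[, c("group", "mpg", "hp")]
head(mtcars_df)
```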
Upvotes: 0
Reputation: 66819
Is there any way to make the group variable a sort of group name (one entry per group), using dplyr? [...] It's much easier to visualize each group this way.
You can create a custom function for printing and call it with magrittr's %T>% for browsing.
Here's an example using data.table/dtplyr, since data.table's split method has the nice keep.by argument:
library(magrittr)
library(data.table)
library(dtplyr)
library(dplyr)
# Simple version: print one sub-table per group, without the grouping columns
print_by0 = function(x){
  gvars = as.character(groups(x))
  if (length(gvars)) x %>% split(by = gvars, keep.by = FALSE) %>% print
  else print(x)
}

# or ... better?
# Single table: the group key is shown only on the first row of each group
print_by = function(x){
  gvars = as.character(groups(x))
  ovars = setdiff(names(x), gvars)
  # .g holds the pasted group key on the first row of each group, "" elsewhere
  y = copy(x)[, .g := replace(rep("", .N), 1L, paste(.BY, collapse = "; ")), keyby=gvars]
  y[, (gvars) := NULL ]
  setcolorder(y, c(".g", ovars))
  setnames(y, ".g", sprintf("GRP: {%s}", paste(gvars, collapse = "; ")))
  print(data.table(y), nrows=Inf)
}
Usage. Notice that res still has the correct structure, since %T>% returns its left-hand side and so leaves it unaltered.
> DT = data.table(mtcars)
> res = DT %>% group_by(am, cyl) %>% select(mpg, hp) %T>% print_by
GRP: {am; cyl} mpg hp
1: 0; 4 24.4 62
2: 22.8 95
3: 21.5 97
4: 0; 6 21.4 110
5: 18.1 105
6: 19.2 123
7: 17.8 123
8: 0; 8 18.7 175
9: 14.3 245
10: 16.4 180
11: 17.3 180
12: 15.2 180
13: 10.4 205
14: 10.4 215
15: 14.7 230
16: 15.5 150
17: 15.2 150
18: 13.3 245
19: 19.2 175
20: 1; 4 22.8 93
21: 32.4 66
22: 30.4 52
23: 33.9 65
24: 27.3 66
25: 26.0 91
26: 30.4 113
27: 21.4 109
28: 1; 6 21.0 110
29: 21.0 110
30: 19.7 175
31: 1; 8 15.8 264
32: 15.0 335
GRP: {am; cyl} mpg hp
> res
Source: local data table [32 x 4]
Groups: am, cyl
# A tibble: 32 x 4
am cyl mpg hp
<dbl> <dbl> <dbl> <dbl>
1 1 6 21.0 110
2 1 6 21.0 110
3 1 4 22.8 93
4 0 6 21.4 110
5 0 8 18.7 175
6 0 6 18.1 105
7 0 8 14.3 245
8 0 4 24.4 62
9 0 4 22.8 95
10 0 6 19.2 123
# ... with 22 more rows
A dplyr guru could translate print_by to work with tibbles, but ... I'll leave that as an exercise.
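One possible take on that exercise, offered as a sketch rather than a definitive translation. `print_by_tbl` is a hypothetical helper name (it is not part of dplyr or magrittr). It reads the grouping columns with `group_vars()` and does the rest in base R, so it assumes only that the input is a grouped tibble or data frame.

```r
library(dplyr)
library(magrittr)  # for %T>%

# Hypothetical helper: print a grouped tibble with each group's key shown once
print_by_tbl <- function(x) {
  gvars <- group_vars(x)
  df <- as.data.frame(ungroup(x))
  if (length(gvars) > 0) {
    gcols <- unname(as.list(df[gvars]))
    # Sort by the grouping columns so each group's rows are contiguous
    ord <- do.call(order, gcols)
    df <- df[ord, , drop = FALSE]
    # Build an "a; b" key per row, then blank all but the first row per group
    key <- do.call(paste, c(gcols, sep = "; "))[ord]
    key[duplicated(key)] <- ""
    label <- sprintf("GRP: {%s}", paste(gvars, collapse = "; "))
    df <- cbind(setNames(data.frame(key), label),
                df[setdiff(names(df), gvars)])
  }
  print(df, row.names = FALSE)
  invisible(x)
}

# The tee pipe prints the grouped view but passes the grouped tibble through
res <- mtcars %>%
  group_by(am, cyl) %>%
  select(mpg, hp) %T>%
  print_by_tbl()
```

As with the data.table version, `res` is still the grouped tibble with its `am` and `cyl` columns intact; only the printed view is rearranged.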
Upvotes: 1
Reputation: 887118
We can use replace to blank out the group value on every row after the first within each group:
library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  select(mpg, hp) %>%
  arrange(cyl) %>%
  # convert to character first so every group yields the same column type
  mutate(cyl1 = replace(as.character(cyl), row_number() > 1, "")) %>%
  ungroup() %>%
  select(-cyl) %>%
  rename(cyl = cyl1)
Upvotes: 1