Reputation: 7928
I am looking for a simple way to rotate (transpose) a dplyr summary tibble.
Say I am doing something like this,
# install.packages(c("dplyr"), dependencies = TRUE)
library(dplyr)
mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)
  )
#> # A tibble: 2 x 6
#> am n Mean_disp Mean_hp Mean_qsec Mean_drat
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl>
#> 1 0 19 290.3789 160.2632 18.18316 3.286316
#> 2 1 13 143.5308 126.8462 17.36000 4.050000
But, what I would like is to get an output more or less like this,
#> # A tibble: 5 x 2
#> am <dbl> 0 1
#> n <int> 19 13
#> Mean_disp <dbl> 290.3789 143.5308
#> Mean_hp <dbl> 160.2631 126.8462
#> Mean_qsec <dbl> 18.183158 17.36000
#> Mean_drat <dbl> 3.286316 4.050000
I realize I can use t(), but that converts the tibble to a matrix and messes up the formatting.
Upvotes: 2
Views: 2218
Reputation: 56219
Maybe gather then spread again:
library(dplyr)
library(tidyr)
mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)) %>%
  gather(key = key, value = value, -am) %>%
  spread(key = am, value = value)
# # A tibble: 5 x 3
# key `0` `1`
# * <chr> <dbl> <dbl>
# 1 Mean_disp 290.378947 143.5308
# 2 Mean_drat 3.286316 4.0500
# 3 Mean_hp 160.263158 126.8462
# 4 Mean_qsec 18.183158 17.3600
# 5 n 19.000000 13.0000
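In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A minimal sketch of the same reshaping with the newer functions, assuming tidyr >= 1.0.0 is installed (the object name `rotated` is only for illustration):

```r
library(dplyr)
library(tidyr)

# Summarise per group, then go long (one row per am/statistic pair)
# and wide again (one column per value of am).
rotated <- mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    Mean_disp = mean(disp),
    Mean_hp = mean(hp),
    Mean_qsec = mean(qsec),
    Mean_drat = mean(drat)
  ) %>%
  pivot_longer(cols = -am, names_to = "key", values_to = "value") %>%
  pivot_wider(names_from = am, values_from = value)

rotated
```

As with gather(), the integer n is coerced to double so that it can share a column with the means.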
Another option: gather before group_by, then take the mean of all selected columns, then spread again (though I am not sure how to add n() this way):
mtcars %>%
  select(am, disp, hp, qsec, drat) %>%
  gather(key = key, value = value, -am) %>%
  group_by(am, key) %>%
  summarise(myMean = mean(value)) %>%
  spread(key = am, value = myMean)
# # A tibble: 4 x 3
# key `0` `1`
# * <chr> <dbl> <dbl>
# 1 disp 290.378947 143.5308
# 2 drat 3.286316 4.0500
# 3 hp 160.263158 126.8462
# 4 qsec 18.183158 17.3600
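One possible way to get n() in without typing every mean by hand is a different route from the gather-first code above: summarise first using across(), then reshape. This is only a sketch and assumes dplyr >= 1.0.0 (for across()) and tidyr >= 1.0.0 (for the pivot functions); the name `means_and_n` is made up for the example.

```r
library(dplyr)
library(tidyr)

means_and_n <- mtcars %>%
  group_by(am) %>%
  summarise(
    n = n(),
    # apply mean() to each listed column and name the results Mean_<column>
    across(c(disp, hp, qsec, drat), mean, .names = "Mean_{.col}")
  ) %>%
  pivot_longer(cols = -am, names_to = "key", values_to = "value") %>%
  pivot_wider(names_from = am, values_from = value)

means_and_n
```

The count is computed in the same summarise() call as the means, so it ends up as one more row after reshaping.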
Upvotes: 6
Reputation: 749
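library(dplyr)
# mt_cars_df is assumed to be the summarised tibble from the question
new_tibble <- as.data.frame(t(mt_cars_df)) %>%
  as_tibble()
# add the original column names as a key column, then move it to the front
new_tibble$name <- names(mt_cars_df)
new_tibble <- new_tibble %>%
  select(name, V1, V2)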
According to this, row names are deprecated in tibbles, so adding them as a key column would be the logical way of handling the situation. This leaves a tibble that requires reordering the columns.
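The same transpose-then-add-a-key-column idea can probably be written a bit more compactly with rownames_to_column() from the tibble package, which puts the key column first so no reordering is needed. A minimal sketch, where `summary_tbl` stands in for the summarised tibble (a shortened version of the question's summary, used here only to keep the example small):

```r
library(dplyr)
library(tibble)

# Stand-in for the summary tibble from the question (fewer columns for brevity)
summary_tbl <- mtcars %>%
  group_by(am) %>%
  summarise(n = n(), Mean_disp = mean(disp))

# t() returns a matrix whose row names are the old column names;
# turn those row names into a regular column before going back to a tibble.
rotated <- as.data.frame(t(summary_tbl)) %>%
  rownames_to_column(var = "name") %>%
  as_tibble()

rotated
```

The value columns keep the default names V1 and V2, as in the code above.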
Upvotes: 0