Reputation: 967
I'd like to make a table that looks like this
I have tibbles with each of the data points, but they're not combined.
library('dplyr')
library('ISLR')
data(Hitters)
Hitters <- na.omit(Hitters)
Q <- Hitters %>% group_by(League) %>%
dplyr::summarize(count = n(), avg_wage = sum(Salary)/n())
A <- Hitters %>% group_by(Division) %>%
dplyr::summarize(count = n(), avg_wage = sum(Salary)/n())
Z <- Hitters %>% group_by(NewLeague) %>%
dplyr::summarize(count = n(), avg_wage = sum(Salary)/n())
My goal is to stack the tibbles above each other in one output with shared "count" and "avg_wage" columns. I tried bind_rows() and ftable(), without success.
Upvotes: 0
Views: 106
Reputation: 2101
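The problem is that bind_rows() matches columns by name, so the three differently named grouping columns (League, Division, NewLeague) each stay in their own column, padded with NA, and you end up with a confusing data frame. We can instead use gather() to collapse them into two new columns, one holding the name of the grouping variable and one holding its value, and get the proper table.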
library(tidyverse)
library(ISLR)
data(Hitters)
Hitters <- na.omit(Hitters)
Q <- Hitters %>% group_by(League) %>%
dplyr::summarize(count = n(), avg_wage = sum(Salary)/n())
A <- Hitters %>% group_by(Division) %>%
dplyr::summarize(count = n(), avg_wage = sum(Salary)/n())
Z <- Hitters %>% group_by(NewLeague) %>%
dplyr::summarize(count = n(), avg_wage = sum(Salary)/n())
list(Q,A,Z) %>%
map_df(bind_rows) %>%
gather("league_type", "league_id", c(1, 4, 5)) %>%
filter(!is.na(league_id))
#> Warning: attributes are not identical across measure variables;
#> they will be dropped
#> # A tibble: 6 x 4
#> count avg_wage league_type league_id
#> <int> <dbl> <chr> <chr>
#> 1 139 542. League A
#> 2 124 529. League N
#> 3 129 624. Division E
#> 4 134 451. Division W
#> 5 141 537. NewLeague A
#> 6 122 535. NewLeague N
Created on 2019-01-21 by the reprex package (v0.2.1)
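In newer versions of tidyr (1.0.0 and later), gather() is superseded by pivot_longer(). Here is a minimal sketch of the same reshaping step with that function. The `combined` tibble is a made-up stand-in for the bound Q, A and Z tibbles, with illustrative values rather than the real Hitters numbers, so the snippet runs without ISLR.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for bind_rows(Q, A, Z): each piece has its own
# grouping column, so the other two are filled with NA after binding.
combined <- bind_rows(
  tibble(League    = c("A", "N"), count = c(2L, 2L), avg_wage = c(150, 350)),
  tibble(Division  = c("E", "W"), count = c(2L, 2L), avg_wage = c(200, 300)),
  tibble(NewLeague = c("A", "N"), count = c(1L, 3L), avg_wage = c(100, 300))
)

# Collapse the three grouping columns into a name/value pair and
# drop the NA padding in the same step.
long <- combined %>%
  pivot_longer(
    c(League, Division, NewLeague),
    names_to = "league_type",
    values_to = "league_id",
    values_drop_na = TRUE
  )

long
```

Selecting the columns by name instead of by position (c(1, 4, 5)) keeps the call valid if the column order changes, and values_drop_na = TRUE takes the place of the separate filter(!is.na(league_id)) step.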
You can use spread() to get it back to wide format, although I would advise against that. The long version will probably be easier to work with.
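Another option is to give the grouping column a shared name before stacking, so that no reshaping is needed at all. The sketch below assumes a hypothetical helper, summarise_by(), and a small made-up data frame `toy` in place of Hitters. Neither comes from the question. mean(Salary) is used as an equivalent of sum(Salary)/n() when there are no missing values.

```r
library(dplyr)

# Made-up stand-in for Hitters (illustrative values only).
toy <- data.frame(
  League    = c("A", "A", "N", "N"),
  Division  = c("E", "W", "E", "W"),
  NewLeague = c("A", "N", "N", "N"),
  Salary    = c(100, 200, 300, 400)
)

# Hypothetical helper: summarise by one grouping column and return the
# group labels under the shared name `league_id`, so that the pieces
# stack without any NA padding.
summarise_by <- function(data, group_col) {
  data %>%
    group_by(league_id = as.character(.data[[group_col]])) %>%
    summarize(count = n(), avg_wage = mean(Salary)) %>%
    mutate(league_type = group_col)
}

# Build one summary per grouping variable, then stack them.
stacked <- lapply(
  c("League", "Division", "NewLeague"),
  function(col) summarise_by(toy, col)
) %>%
  bind_rows()

stacked
```

Because every piece already has the same four columns, bind_rows() stacks them directly. Converting the labels to character also sidesteps any mismatch between factor levels across the three grouping columns.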
Upvotes: 1