Reputation: 601

dplyr summary tables for list of columns

I have a data frame:

role <- c("hi", "hi", "lo", "lo", "me", "me")
a <- c("w", "r","w", "r", "w", "r")
b <- c("z", "x","z", "x", "z", "x")
c <- c("o", "o","p", "p", "t", "y")

df <- data.frame(role, a, b, c)

To get a pivot table style summary of which roles have what frequency of factors in column a I use dplyr:

summ <- df %>%
group_by(role, a) %>%
tally() %>%
spread(a, n, fill = 0)

summ

How can I automatically generate separate pivot tables for all columns (a, b and c) using one dplyr pipe?

Upvotes: 1

Answers (4)

Z.Lin

Reputation: 29085

One approach, retaining the code you used for a single pivot table:

library(dplyr); library(tidyr)

df %>%
  gather(key, value, -role) %>%
  split(.$key) %>%
  lapply(function(x){x %>% group_by(role, value) %>%
      tally() %>%
      spread(value, n, fill = 0)})

$a
# A tibble: 3 x 3
# Groups:   role [3]
   role     r     w
* <chr> <dbl> <dbl>
1    hi     1     1
2    lo     1     1
3    me     1     1

$b
# A tibble: 3 x 3
# Groups:   role [3]
   role     x     z
* <chr> <dbl> <dbl>
1    hi     1     1
2    lo     1     1
3    me     1     1

$c
# A tibble: 3 x 5
# Groups:   role [3]
   role     o     p     t     y
* <chr> <dbl> <dbl> <dbl> <dbl>
1    hi     2     0     0     0
2    lo     0     2     0     0
3    me     0     0     1     1

Upvotes: 2

lmo

Reputation: 38500

This is pretty straightforward in base R with lapply and table:

lapply(df[-1], function(x) table(df$role, x))
$a
    x
     r w
  hi 1 1
  lo 1 1
  me 1 1

$b
    x
     x z
  hi 1 1
  lo 1 1
  me 1 1

$c
    x
     o p t y
  hi 2 0 0 0
  lo 0 2 0 0
  me 0 0 1 1

This returns a list of table objects. If you want a list of data.frames, you could do

lapply(df[-1], function(x) as.data.frame.matrix(table(df$role, x)))

Upvotes: 1

akrun

Reputation: 887068

Here is an option using tidyverse

library(tidyverse)
names(df)[-1] %>% 
    map(~df %>% 
         select(.x, role) %>% 
         group_by(!!sym(.x), role) %>%
         tally() %>% 
         spread(!!sym(.x), n, fill = 0))
#    [[1]]
# A tibble: 3 x 3
#    role     r     w
#* <fctr> <dbl> <dbl>
#1     hi     1     1
#2     lo     1     1
#3     me     1     1

#[[2]]
# A tibble: 3 x 3
#    role     x     z
#* <fctr> <dbl> <dbl>
#1     hi     1     1
#2     lo     1     1
#3     me     1     1

#[[3]]
# A tibble: 3 x 5
#    role     o     p     t     y
#* <fctr> <dbl> <dbl> <dbl> <dbl>
#1     hi     2     0     0     0
#2     lo     0     2     0     0
#3     me     0     0     1     1

Upvotes: 3

Jake Kaupp

Reputation: 8072

FYI You're also using tidyr for gather, and you can use count in place of the group then tally for a more concise pipe.

If you want a single table outlining all combinations:

library(tidyr)
library(dplyr)

df %>%
  gather(variable, value, -role) %>% 
  count(role, variable, value) %>% 
  spread(value, n, fill = 0)

    role variable     o     p     r     t     w     x     y     z
* <fctr>    <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     hi        a     0     0     1     0     1     0     0     0
2     hi        b     0     0     0     0     0     1     0     1
3     hi        c     2     0     0     0     0     0     0     0
4     lo        a     0     0     1     0     1     0     0     0
5     lo        b     0     0     0     0     0     1     0     1
6     lo        c     0     2     0     0     0     0     0     0
7     me        a     0     0     1     0     1     0     0     0
8     me        b     0     0     0     0     0     1     0     1
9     me        c     0     0     0     1     0     0     1     0

Upvotes: 0

dplyr summary tables for list of columns

Answers (4)

Related Questions