Reputation: 479
I have a data frame where one of the columns is a list column holding one matrix per row; each matrix is the transition matrix for that observation.
library(tidyverse)
m <- matrix(1:4, ncol = 2)
# tibble() replaces the deprecated data_frame()
d <- tibble(g = c('a', 'a', 'b', 'b', 'b', 'c'),
            m = rep(list(m), 6))
This looks like:
# A tibble: 6 × 2
  g     m
  <chr> <list>
1 a     <int [2 × 2]>
2 a     <int [2 × 2]>
3 b     <int [2 × 2]>
4 b     <int [2 × 2]>
5 b     <int [2 × 2]>
6 c     <int [2 × 2]>
I want to get out a list of matrices, one per group (a, b, and so on), where each matrix is the sum of all the matrices for that grouping factor. The method needs to generalize to an arbitrary number of groups, because I will not know the number of grouping factors in advance.
I have tried by_slice and do, but all I can manage to output is the sum of all matrices, or the sum of either the a or the b matrices alone, not all the group sums together in a single result.
Upvotes: 3
Views: 395
Reputation: 76
Another way, using group_by, summarise, and reduce:
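# Sum a list of matrices; wrap the result in a list so that
# summarise() can store it in a list column
m_sum <- function(l) {
  reduce(l, `+`) %>% list()
}
group_by(d, g) %>%
  summarise(m_sum = m_sum(m)) %>%
  select(m_sum) %>%
  # Drop the tibble wrapper, leaving a plain list of summed matrices
  unlist(recursive = FALSE)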
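If the list should be named by group, one possible variation on this approach is sketched below. It assumes tibble's deframe() is acceptable for turning the two-column summary into a named list, and the name named_sums is only illustrative:

```r
library(dplyr)
library(purrr)
library(tibble)

# Same toy data as in the question
m <- matrix(1:4, ncol = 2)
d <- tibble(g = c("a", "a", "b", "b", "b", "c"),
            m = rep(list(m), 6))

# Inside summarise(), m refers to the list column of the current group.
# list() wraps the summed matrix so it fits in a list column, and
# deframe() turns the (g, m_sum) pair of columns into a named list.
named_sums <- d %>%
  group_by(g) %>%
  summarise(m_sum = list(reduce(m, `+`))) %>%
  deframe()

names(named_sums)
```

Each element can then be accessed by its group label, for example named_sums$b.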
Upvotes: 3
Reputation: 78630
You can do this by nesting the matrices within groups (with tidyr's nest), which creates a list column of tibbles, one per group, each holding that group's matrices. You can then use purrr's map and reduce to sum up the matrices within each group:
results <- d %>%
  nest(data = -g) %>%   # tidyr >= 1.0 syntax; older versions used nest(-g)
  mutate(summed = map(data, ~ reduce(.$m, `+`)))
Results:
# A tibble: 3 × 3
  g     data             summed
  <chr> <list>           <list>
1 a     <tibble [2 × 1]> <int [2 × 2]>
2 b     <tibble [3 × 1]> <int [2 × 2]>
3 c     <tibble [1 × 1]> <int [2 × 2]>
The summed column will have the matrices added up within each group.
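To make the reduce step concrete, here is a small sketch of what it does for a single group, using the three identical matrices of group b from the question (the names group_b and summed_b are only illustrative):

```r
library(purrr)

m <- matrix(1:4, ncol = 2)
group_b <- list(m, m, m)

# reduce() folds the list with `+`, i.e. it computes (m + m) + m
summed_b <- reduce(group_b, `+`)
summed_b
#      [,1] [,2]
# [1,]    3    9
# [2,]    6   12
```

map then applies this same fold once per group.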
If you wanted to turn this into a named list with items a/b/c of matrices, you could do:
lst <- results$summed
names(lst) <- results$g
lst
or alternatively:
results %>%
select(-data) %>%
spread(g, summed)
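For comparison, a base R sketch of the same grouped sum, which is a different technique from the tidyverse pipelines above: split() partitions the matrices by group label and Reduce() adds them up. It assumes the labels and matrices are available as a plain vector and list, and the names mats and sums are only illustrative:

```r
# Same toy data as in the question, without a data frame
m <- matrix(1:4, ncol = 2)
g <- c("a", "a", "b", "b", "b", "c")
mats <- rep(list(m), 6)

# split() groups the list elements by g; Reduce() sums each group's matrices
sums <- lapply(split(mats, g), function(ms) Reduce(`+`, ms))

names(sums)
# [1] "a" "b" "c"
```

Because split() creates one element per distinct label, this works for any number of groups.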
Upvotes: 6