sebpardo

Reputation: 707

Bind vectors across lists to single list of matrices

I'd like to join multiple vectors across separate lists and output a single list of matrices. The idea is that all items with the same name, for example all the a items, are bound by rows into one matrix. The added complication is that these vectors can be of different lengths, so rbind is not straightforward to use; shorter vectors should be padded with NAs.

Example

Input lists:

list1 <- list(a = 1:5, b = 6:10, c = 11:15)
list2 <- list(a = 1:4, b = 6:9, c = 11:14)
list3 <- list(a = 1:3, b = 6:8, c = 11:13)

list1
# $a
# [1] 1 2 3 4 5
# 
# $b
# [1]  6  7  8  9 10
# 
# $c
# [1] 11 12 13 14 15
# 

The desired output is a list with as many matrices as there are unique item names, where each matrix consists of the vectors of differing lengths bound by rows:

# $a
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    2    3    4    5
# [2,]    1    2    3    4   NA
# [3,]    1    2    3   NA   NA
# 
# $b
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    6    7    8    9   10
# [2,]    6    7    8    9   NA
# [3,]    6    7    8   NA   NA
# 
# $c
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   11   12   13   14   15
# [2,]   11   12   13   14   NA
# [3,]   11   12   13   NA   NA

How would I write a function that does this and also scales up to more lists with vectors of varying lengths?

Upvotes: 8

Views: 263

Answers (5)

M--

Reputation: 28825

Just a doodle of mine:

library(magrittr)
list(list1, list2, list3) %>% 
  do.call("rbind", .) %>%
  as.data.frame() %>% 
  sapply(., function(x) lapply(x, `length<-`, max(lengths(x)))) %>% 
  apply(., 2, as.list) %>% 
  lapply(., function(x) do.call(rbind, x))
# $a
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    2    3    4    5
# [2,]    1    2    3    4   NA
# [3,]    1    2    3   NA   NA
# 
# $b
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    6    7    8    9   10
# [2,]    6    7    8    9   NA
# [3,]    6    7    8   NA   NA
# 
# $c
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   11   12   13   14   15
# [2,]   11   12   13   14   NA
# [3,]   11   12   13   NA   NA

Upvotes: 2

akrun

Reputation: 887078

One option is to transpose the list of lists, reduce each group of elements to a single dataset with cbind.fill, transpose the result (t), and set the row names to NULL.

library(tidyverse)
library(rowr)
list(list1, list2, list3) %>% 
    transpose %>% 
    map(~ reduce(.x, cbind.fill, fill = NA) %>% 
          t %>% 
         `row.names<-`(NULL))
#$a
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    2    3    4    5
#[2,]    1    2    3    4   NA
#[3,]    1    2    3   NA   NA

#$b
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    6    7    8    9   10
#[2,]    6    7    8    9   NA
#[3,]    6    7    8   NA   NA

#$c
#     [,1] [,2] [,3] [,4] [,5]
#[1,]   11   12   13   14   15
#[2,]   11   12   13   14   NA
#[3,]   11   12   13   NA   NA

Or using base R

do.call(Map, c(f = function(...) {
  l1 <- list(...)
  # pad every vector to the longest length, then bind by rows
  do.call(rbind, lapply(l1, `length<-`, max(lengths(l1))))
}, mget(paste0("list", 1:3))))
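
To call this with any number of lists rather than looking them up by name with mget, the same Map/rbind idea can be wrapped in a function. This is only a minimal sketch: bind_by_name is a hypothetical name, not something from the answer itself, and it assumes every input list holds its elements in the same order, because Map pairs elements by position rather than by name.

```r
list1 <- list(a = 1:5, b = 6:10, c = 11:15)
list2 <- list(a = 1:4, b = 6:9, c = 11:14)
list3 <- list(a = 1:3, b = 6:8, c = 11:13)

# Hypothetical helper (name chosen for illustration only).
# Assumes all lists store their elements in the same order.
bind_by_name <- function(...) {
  lists <- list(...)
  # Map() walks the lists in parallel: all first elements, then all second ones, ...
  do.call(Map, c(f = function(...) {
    vecs <- list(...)
    n <- max(lengths(vecs))
    # `length<-` pads shorter vectors with NA up to length n
    do.call(rbind, lapply(vecs, `length<-`, n))
  }, lists))
}

res <- bind_by_name(list1, list2, list3)
res$a[3, ]
# [1]  1  2  3 NA NA
```

The result takes its names (a, b, c) from the first list passed in. If the lists could be ordered differently, matching by name instead of position would be the safer choice.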

Upvotes: 4

Ronak Shah

Reputation: 388962

Using base R, we can concatenate all the lists into one flat list (list_df), loop over the unique names in list_df, subset the elements that share each name, and build one matrix per name.

list_df <- c(list1, list2, list3)

lapply(unique(names(list_df)), function(x) {
     temp <- list_df[names(list_df) == x]
     t(sapply(temp, `[`, seq_len(max(lengths(temp)))))
})

#[[1]]
#  [,1] [,2] [,3] [,4] [,5]
#a    1    2    3    4    5
#a    1    2    3    4   NA
#a    1    2    3   NA   NA

#[[2]]
#  [,1] [,2] [,3] [,4] [,5]
#b    6    7    8    9   10
#b    6    7    8    9   NA
#b    6    7    8   NA   NA

#[[3]]
#  [,1] [,2] [,3] [,4] [,5]
#c   11   12   13   14   15
#c   11   12   13   14   NA
#c   11   12   13   NA   NA
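
The result above is an unnamed list whose rows carry the element name. If a named list with unnamed rows is wanted, as in the question, a small variation could look like the sketch below. It assumes the lists are combined the same way as above; nms and res are names introduced here for illustration. It also swaps t(sapply(...)) for do.call(rbind, lapply(...)), which keeps the row-wise layout even when every vector has length 1.

```r
list1 <- list(a = 1:5, b = 6:10, c = 11:15)
list2 <- list(a = 1:4, b = 6:9, c = 11:14)
list3 <- list(a = 1:3, b = 6:8, c = 11:13)

list_df <- c(list1, list2, list3)
nms <- unique(names(list_df))

res <- setNames(lapply(nms, function(x) {
  # keep only the vectors called x, and drop their names so rows stay unnamed
  temp <- unname(list_df[names(list_df) == x])
  idx <- seq_len(max(lengths(temp)))
  # indexing past the end of a vector yields NA, which pads the short ones
  do.call(rbind, lapply(temp, `[`, idx))
}), nms)

names(res)
# [1] "a" "b" "c"
```

Because elements are matched by name here, a list that lacks one of the names simply contributes no row to that matrix.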

Upvotes: 1

jay.sf

Reputation: 72758

You may use (1) rapply to pad the vectors in the sublists to a common length, and (2) t(mapply(...)) with `[[` to pick the matching element from each list and build the matrices.

listn <- list(list1, list2, list3)

# iterate over the elements (a, b, c) of a list, not over the number of lists
setNames(lapply(seq_along(listn[[1]]), function(x) 
  t(mapply(`[[`, rapply(listn, `length<-`, value=5, how="list"), x))), names(listn[[1]]))
# $a
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    2    3    4    5
# [2,]    1    2    3    4   NA
# [3,]    1    2    3   NA   NA
# 
# $b
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    6    7    8    9   10
# [2,]    6    7    8    9   NA
# [3,]    6    7    8   NA   NA
# 
# $c
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   11   12   13   14   15
# [2,]   11   12   13   14   NA
# [3,]   11   12   13   NA   NA

If the maximum length is not known in advance, compute it with this code and pass it as value:

max(rapply(listn, length))
# [1] 5
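
Putting the two pieces together, a sketch that computes the padding length instead of hard-coding 5 might look like this. The names n_max, padded and res are introduced here for illustration only, and the code assumes every list has the same elements in the same order.

```r
list1 <- list(a = 1:5, b = 6:10, c = 11:15)
list2 <- list(a = 1:4, b = 6:9, c = 11:14)
list3 <- list(a = 1:3, b = 6:8, c = 11:13)
listn <- list(list1, list2, list3)

# longest vector anywhere in the nested list
n_max <- max(rapply(listn, length))

# pad every vector to n_max while keeping the nested list structure
padded <- rapply(listn, `length<-`, value = n_max, how = "list")

# for each element position i, pull element i from every list and bind by rows
res <- setNames(
  lapply(seq_along(listn[[1]]), function(i) t(mapply(`[[`, padded, i))),
  names(listn[[1]])
)
```

Note that every vector is padded to the overall maximum, so a group whose vectors are all shorter than n_max would end up with trailing all-NA columns.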

Upvotes: 2

denis

Reputation: 5673

A data.table solution, just for fun:

library(data.table)
library(magrittr)  # for %>%

plouf <- list(list1, list2, list3)

lapply(names(list1), function(name) {
  lapply(plouf, function(x) {
    # one-row data.table per vector
    as.data.table(t(x[[name]]))
  }) %>% 
    rbindlist(., fill = TRUE) %>%
    `colnames<-`(NULL)
}) %>% setNames(names(list1))


# $a
# 
# 1:  1  2  3  4  5
# 2:  1  2  3  4 NA
# 3:  1  2  3 NA NA
# 
# $b
# 
# 1:  6  7  8  9 10
# 2:  6  7  8  9 NA
# 3:  6  7  8 NA NA
# 
# $c
# 
# 1: 11 12 13 14 15
# 2: 11 12 13 14 NA
# 3: 11 12 13 NA NA

The first loop is over the element names. The second loop is over the list of lists: it extracts the named element from each list and transposes it into a one-row data.table, so that rbindlist can fill the missing columns with NA.

Without data.table, a similar approach, though not as good as what akrun proposed:

library(plyr)
library(magrittr)  # for %>%

lapply(names(list1),function(name){
  lapply(plouf,function(x){
    t(x[[name]])%>%
      as.data.frame
  }) %>% 
   rbind.fill %>%
    `colnames<-`(NULL)
}) %>% setNames(names(list1))

Upvotes: 4
