Electrino

Reputation: 2890

How to merge datasets in a list by rows in R?

Similar questions have been asked before, but none seem to help with my specific situation. I'm trying to merge a bunch of datasets (stored in a list) and turn them into a single matrix, but I want to merge them by row. For example, suppose we have some data that looks like this:

set.seed(100)

dfList <- list()
for (i in 1:3) {
  dfList[[i]] <- data.frame(
    x1 = sample(1:10, 3, replace = TRUE),
    x2 = sample(1:10, 3, replace = TRUE)
  )
}

> dfList
[[1]]
  x1 x2
1 10  3
2  7  9
3  6 10

[[2]]
  x1 x2
1  7  4
2  6  7
3  6  6

[[3]]
  x1 x2
1  2  7
2  7  8
3  7  2

I am trying to interleave the rows of the datasets and turn the result into a matrix. That is, the 1st row of my new matrix should come from the 1st row of the 1st data frame in the list, the 2nd row of my new matrix should come from the 1st row of the 2nd data frame in the list, and so on. Once the 1st row of every data frame has been used, the same is repeated for the 2nd rows, then the 3rd rows.

So, using the above example, my desired output would look like:

      x1 x2
 [1,] 10  3
 [2,]  7  4
 [3,]  2  7
 [4,]  7  9
 [5,]  6  7
 [6,]  7  8
 [7,]  6 10
 [8,]  6  6
 [9,]  7  2

Any suggestions as to how I could do this?

Upvotes: 0

Views: 192

Answers (5)

G. Grothendieck

Reputation: 269824

Use abind like this:

library(abind)
nc <- ncol(dfList[[1]])
matrix(t(abind(dfList)), ncol = nc, byrow = TRUE)

giving:

      [,1] [,2]
 [1,]   10    3
 [2,]    7    4
 [3,]    2    7
 [4,]    7    9
 [5,]    6    7
 [6,]    7    8
 [7,]    6   10
 [8,]    6    6
 [9,]    7    2

or with only base R:

nc <- ncol(dfList[[1]])
matrix(t(do.call("cbind", dfList)), ncol = nc, byrow = TRUE)
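
Note that matrix() drops the column names, which is why the output above shows [,1] and [,2] rather than x1 and x2. A minimal sketch of restoring them, assuming every data frame in the list has the same columns as the first one. Here dfList is rebuilt by hand from the values printed in the question, so the example does not depend on the random seed:

```r
# dfList rebuilt from the values printed in the question
dfList <- list(
  data.frame(x1 = c(10, 7, 6), x2 = c(3, 9, 10)),
  data.frame(x1 = c(7, 6, 6),  x2 = c(4, 7, 6)),
  data.frame(x1 = c(2, 7, 7),  x2 = c(7, 8, 2))
)

nc <- ncol(dfList[[1]])

# Same base R approach as above: bind side by side, then refill by row
m <- matrix(t(do.call("cbind", dfList)), ncol = nc, byrow = TRUE)

# Assumption: all data frames share the column names of the first one
colnames(m) <- names(dfList[[1]])
m
```

The result has the same nine rows as before, now with x1 and x2 as column headers.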

Upvotes: 2

Ronak Shah

Reputation: 389065

Here's my attempt -

library(dplyr)

lapply(dfList, asplit, 1) %>% purrr::transpose() %>% bind_rows()

#    x1    x2
#  <int> <int>
#1    10     3
#2     7     4
#3     2     7
#4     7     9
#5     6     7
#6     7     8
#7     6    10
#8     6     6
#9     7     2

If you need a matrix as output you can add %>% as.matrix() to the chain.

Upvotes: 0

IceCreamToucan

Reputation: 28695

Here are two other ways: stack all the data frames with an rbind, then reorder the result by each row's position within its original data frame.

set.seed(100)

dfList <- NULL
for(i in 1:3){
  dfList[[i]] <- data.frame(
    x1 = sample(1:10, 3, replace = T),
    x2 = sample(1:10, 3, replace = T)
    )
}

library(data.table)
rbindlist(dfList, idcol = 'x')[order(rowid(x)), -'x']
#>       x1    x2
#>    <int> <int>
#> 1:    10     3
#> 2:     7     4
#> 3:     2     7
#> 4:     7     9
#> 5:     6     7
#> 6:     7     8
#> 7:     6    10
#> 8:     6     6
#> 9:     7     2

Created on 2022-03-01 by the reprex package (v2.0.1)

set.seed(100)

dfList <- NULL
for(i in 1:3){
  dfList[[i]] <- data.frame(
    x1 = sample(1:10, 3, replace = T),
    x2 = sample(1:10, 3, replace = T)
    )
}

do.call(rbind, dfList)[order(unlist(lapply(dfList, function(x) seq(nrow(x))))),]
#>   x1 x2
#> 1 10  3
#> 4  7  4
#> 7  2  7
#> 2  7  9
#> 5  6  7
#> 8  7  8
#> 3  6 10
#> 6  6  6
#> 9  7  2


Upvotes: 0

VFreguglia

Reputation: 2311

do.call(rbind, lapply(1:nrow(dfList[[1]]), function(x){
  do.call(rbind, lapply(dfList, function(y) y[x,]))
  }))

   x1 x2
1  10  3
2   7  4
3   2  7
23  7  9
21  6  7
22  7  8
33  6 10
31  6  6
32  7  2

This achieves what you want, assuming all data frames have the same number of rows.

The question does not say what should happen when the number of rows differs between data frames.
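
If the row counts can differ, one possible convention (an assumption, since the question does not specify one) is to keep interleaving and simply skip any data frame that has run out of rows. A rough base R sketch of that idea; interleave_rows and the toy data frames d1, d2, d3 are hypothetical names used only for illustration:

```r
# Hypothetical helper: interleave rows of a list of data frames,
# skipping data frames that have no row at a given position.
interleave_rows <- function(dfs) {
  stacked <- do.call(rbind, dfs)
  # Position of each stacked row within its original data frame
  row_in_df <- unlist(lapply(dfs, function(d) seq_len(nrow(d))))
  # Index of the data frame each stacked row came from
  df_id <- rep(seq_along(dfs), times = vapply(dfs, nrow, integer(1)))
  # Sort by row position first, then by data frame order
  out <- as.matrix(stacked[order(row_in_df, df_id), , drop = FALSE])
  rownames(out) <- NULL
  out
}

# Toy data with 3, 1 and 2 rows respectively
d1 <- data.frame(x1 = c(1, 2, 3), x2 = c(10, 20, 30))
d2 <- data.frame(x1 = 4,          x2 = 40)
d3 <- data.frame(x1 = c(5, 6),    x2 = c(50, 60))

res <- interleave_rows(list(d1, d2, d3))
res
```

With this convention the x1 column of the toy result comes out as 1, 4, 5, 2, 6, 3. When all data frames have the same number of rows, the helper gives the same ordering as the code above.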

Upvotes: 1

deschen

Reputation: 10996

You can do:

library(tidyverse)
new_matrix <- lapply(seq_along(dfList),
                     function(x) {dfList[[x]] <- dfList[[x]] %>% mutate(id1 = 1:n(), id2 = x)}) %>%
  bind_rows() %>%
  arrange(id1, id2) %>%
  select(-id1, -id2) %>%
  as.matrix()

      x1 x2
 [1,] 10  3
 [2,]  7  4
 [3,]  2  7
 [4,]  7  9
 [5,]  6  7
 [6,]  7  8
 [7,]  6 10
 [8,]  6  6
 [9,]  7  2

Upvotes: 1
