MYaseen208
MYaseen208

Reputation: 23898

tidyverse: binding list elements efficiently

I want to bind data.frames of same number of rows from a list as given below.

df1 <- data.frame(A1 = 1:10, B1 = 11:20)
df2 <- data.frame(A1 = 1:10, C1 = 21:30)
df3 <- data.frame(A2 = 1:15, B2 = 11:25, C2 = 31:45)
df4 <- data.frame(A2 = 1:15, D2 = 11:25, E2 = 51:65)
df5 <- 5

ls <- list(df1, df2, df3, df4, df5)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
bind_cols(ls[1], ls[2], .id = NULL)
#> New names:
#> * A1 -> A1...1
#> * A1 -> A1...3
#>    A1...1 B1 A1...3 C1
#> 1       1 11      1 21
#> 2       2 12      2 22
#> 3       3 13      3 23
#> 4       4 14      4 24
#> 5       5 15      5 25
#> 6       6 16      6 26
#> 7       7 17      7 27
#> 8       8 18      8 28
#> 9       9 19      9 29
#> 10     10 20     10 30
bind_cols(ls[3], ls[4], .id = NULL)
#> New names:
#> * A2 -> A2...1
#> * A2 -> A2...4
#>    A2...1 B2 C2 A2...4 D2 E2
#> 1       1 11 31      1 11 51
#> 2       2 12 32      2 12 52
#> 3       3 13 33      3 13 53
#> 4       4 14 34      4 14 54
#> 5       5 15 35      5 15 55
#> 6       6 16 36      6 16 56
#> 7       7 17 37      7 17 57
#> 8       8 18 38      8 18 58
#> 9       9 19 39      9 19 59
#> 10     10 20 40     10 20 60
#> 11     11 21 41     11 21 61
#> 12     12 22 42     12 22 62
#> 13     13 23 43     13 23 63
#> 14     14 24 44     14 24 64
#> 15     15 25 45     15 25 65

In my actual list, I have about twenty data.frames of different number of rows. I wonder if there is a more efficient way of binding data.frames of same number of rows without giving the name and index of list elements.

Upvotes: 2

Views: 418

Answers (2)

akrun
akrun

Reputation: 887223

It is easier to do this by splitting. Create a grouping index with gl

grp <- as.integer(gl(length(ls), 2, length(ls)))

and then use split

library(dplyr)
library(purrr)
library(stringr)
split(ls, grp) %>% # // split by the grouping index 
     map(bind_cols) %>% # // loop over the `list` and use `bind_cols`
     set_names(str_c('df', seq_along(.))) %>% # // name the `list`
     list2env(.GlobalEnv) # // create objects in global env

-output

head(df1)
#  A1...1 B1 A1...3 C1
#1      1 11      1 21
#2      2 12      2 22
#3      3 13      3 23
#4      4 14      4 24
#5      5 15      5 25
#6      6 16      6 26

head(df2)
#  A2...1 B2 C2 A2...4 D2 E2
#1      1 11 31      1 11 51
#2      2 12 32      2 12 52
#3      3 13 33      3 13 53
#4      4 14 34      4 14 54
#5      5 15 35      5 15 55
#6      6 16 36      6 16 56

head(df3)
# A tibble: 1 x 1
#   ...1
#  <dbl>
#1     5

NOTE:

  1. It is better to keep the elements in the list instead of creating objects in the global environment i.e. list2env
  2. ls is a function name and naming an object with function name is not a good option as it can lead to buggy situations

Upvotes: 2

Duck
Duck

Reputation: 39605

Maybe not the optimal approach but you can use a loop and bind the dataframes with same number of columns into a new dataframes. The main of this code is to check the dimension of each dataframe and create an unique vector. Then in the loop you can use lapply() to subset the dataframes in ls and the bind their columns. Here the code (Updated considering the little df5, you can make the trick managing it as a dataframe):

library(dplyr)
#Data
df1 <- data.frame(A1 = 1:10, B1 = 11:20)
df2 <- data.frame(A1 = 1:10, C1 = 21:30)
df3 <- data.frame(A2 = 1:15, B2 = 11:25, C2 = 31:45)
df4 <- data.frame(A2 = 1:15, D2 = 11:25, E2 = 51:65)
df5 <- 5
#List
ls <- list(df1, df2, df3, df4,df5)
#Index
index <- sapply(ls,function(x)dim(as.data.frame(x))[1])
m <- unique(index)
#Loop
for(i in 1:length(m))
{
  assign(paste0('df',i),do.call(bind_cols,ls[lapply(ls,function(x) dim(as.data.frame(x))[1]==m[i])==T]))
}

Output:

df1
   A1...1 B1 A1...3 C1
1       1 11      1 21
2       2 12      2 22
3       3 13      3 23
4       4 14      4 24
5       5 15      5 25
6       6 16      6 26
7       7 17      7 27
8       8 18      8 28
9       9 19      9 29
10     10 20     10 30

df2
   A2...1 B2 C2 A2...4 D2 E2
1       1 11 31      1 11 51
2       2 12 32      2 12 52
3       3 13 33      3 13 53
4       4 14 34      4 14 54
5       5 15 35      5 15 55
6       6 16 36      6 16 56
7       7 17 37      7 17 57
8       8 18 38      8 18 58
9       9 19 39      9 19 59
10     10 20 40     10 20 60
11     11 21 41     11 21 61
12     12 22 42     12 22 62
13     13 23 43     13 23 63
14     14 24 44     14 24 64
15     15 25 45     15 25 65

df3
  ...1
1    5

Upvotes: 2

Related Questions