tidyverse: binding list elements efficiently

Question

I want to bind data.frames of same number of rows from a list as given below.

df1 <- data.frame(A1 = 1:10, B1 = 11:20)
df2 <- data.frame(A1 = 1:10, C1 = 21:30)
df3 <- data.frame(A2 = 1:15, B2 = 11:25, C2 = 31:45)
df4 <- data.frame(A2 = 1:15, D2 = 11:25, E2 = 51:65)
df5 <- 5

ls <- list(df1, df2, df3, df4, df5)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
bind_cols(ls[1], ls[2], .id = NULL)
#> New names:
#> * A1 -> A1...1
#> * A1 -> A1...3
#>    A1...1 B1 A1...3 C1
#> 1       1 11      1 21
#> 2       2 12      2 22
#> 3       3 13      3 23
#> 4       4 14      4 24
#> 5       5 15      5 25
#> 6       6 16      6 26
#> 7       7 17      7 27
#> 8       8 18      8 28
#> 9       9 19      9 29
#> 10     10 20     10 30
bind_cols(ls[3], ls[4], .id = NULL)
#> New names:
#> * A2 -> A2...1
#> * A2 -> A2...4
#>    A2...1 B2 C2 A2...4 D2 E2
#> 1       1 11 31      1 11 51
#> 2       2 12 32      2 12 52
#> 3       3 13 33      3 13 53
#> 4       4 14 34      4 14 54
#> 5       5 15 35      5 15 55
#> 6       6 16 36      6 16 56
#> 7       7 17 37      7 17 57
#> 8       8 18 38      8 18 58
#> 9       9 19 39      9 19 59
#> 10     10 20 40     10 20 60
#> 11     11 21 41     11 21 61
#> 12     12 22 42     12 22 62
#> 13     13 23 43     13 23 63
#> 14     14 24 44     14 24 64
#> 15     15 25 45     15 25 65

In my actual list, I have about twenty data.frames of different number of rows. I wonder if there is a more efficient way of binding data.frames of same number of rows without giving the name and index of list elements.

akrun · Accepted Answer

It is easier to do this by splitting. Create a grouping index with gl

grp <- as.integer(gl(length(ls), 2, length(ls)))

and then use split

library(dplyr)
library(purrr)
library(stringr)
split(ls, grp) %>% # // split by the grouping index 
     map(bind_cols) %>% # // loop over the `list` and use `bind_cols`
     set_names(str_c('df', seq_along(.))) %>% # // name the `list`
     list2env(.GlobalEnv) # // create objects in global env

-output

head(df1)
#  A1...1 B1 A1...3 C1
#1      1 11      1 21
#2      2 12      2 22
#3      3 13      3 23
#4      4 14      4 24
#5      5 15      5 25
#6      6 16      6 26

head(df2)
#  A2...1 B2 C2 A2...4 D2 E2
#1      1 11 31      1 11 51
#2      2 12 32      2 12 52
#3      3 13 33      3 13 53
#4      4 14 34      4 14 54
#5      5 15 35      5 15 55
#6      6 16 36      6 16 56

head(df3)
# A tibble: 1 x 1
#   ...1
#  
#1     5

NOTE:

It is better to keep the elements in the list instead of creating objects in the global environment i.e. list2env
ls is a function name and naming an object with function name is not a good option as it can lead to buggy situations

tidyverse: binding list elements efficiently

Answers (2)

Related Questions