Anna
Anna

Reputation: 51

Very simple yet confusing R question about bind_rows()

I'm trying to combine multiple dataframes into one dataframe using bind_rows. Each dataframe has the same column names and length. Let's say each dataframe is named "df" and there are 100 of them. So df1, df2, df3... df100.

I do not want to write each dataframe in the function as bind_rows(df1, df2 ... df100). I tried

total_df <- bind_rows(paste0(df1:df100))

It did not work. Would there be an easier way to do this? Thank you!!

Upvotes: 5

Views: 331

Answers (3)

akaDrHouse
akaDrHouse

Reputation: 2240

I like using the data.table function rbindlist with the idcol parameter. That will show you the source table/df the row came from prior to binding.

I also like rbindlist because it will give you the option to fill bind tables with some columns added/missing with the fill = TRUE.

library(data.table)

df1 <- mtcars 
df2 <- mtcars 
df3 <- mtcars 
    
# Assuming your files are in working memory and have a common naming convention #
my_mem_objects <- lapply(ls(pattern = "df"), get)

# Assign names of the list to the same name they had as memory objects
names(my_mem_objects) <- ls(pattern = "df")

# Combine the dataframes and retain the name of each object the output as new column
alldata <- rbindlist(my_mem_objects,idcol = "Source_DF")

head(alldata)
tail(alldata)

Gives this output:

   Source_DF  mpg cyl disp  hp drat    wt  qsec vs am gear carb
1:         1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2:         1 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3:         1 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4:         1 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5:         1 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
6:         1 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

and:

   Source_DF  mpg cyl  disp  hp drat    wt qsec vs am gear carb
1:         3 26.0   4 120.3  91 4.43 2.140 16.7  0  1    5    2
2:         3 30.4   4  95.1 113 3.77 1.513 16.9  1  1    5    2
3:         3 15.8   8 351.0 264 4.22 3.170 14.5  0  1    5    4
4:         3 19.7   6 145.0 175 3.62 2.770 15.5  0  1    5    6
5:         3 15.0   8 301.0 335 3.54 3.570 14.6  0  1    5    8
6:         3 21.4   4 121.0 109 4.11 2.780 18.6  1  1    4    2

Upvotes: 0

jpdugo17
jpdugo17

Reputation: 7106

library(tidyverse)

#generate some data
rerun(10, iris) %>% 
    walk2(str_c('df', 1:length(.)), ~assign(.y, .x, pos = .GlobalEnv))

map(str_c('df', 1:10), ~ eval(sym(.x))) %>%
    reduce(bind_rows) %>%
    as_tibble() #to avoid console flooding
#> # A tibble: 1,500 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # … with 1,490 more rows

#if all the df's are inside a list

data <- rerun(10, iris)

reduce(data, bind_rows) %>% 
    as_tibble()
#> # A tibble: 1,500 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # … with 1,490 more rows

Created on 2021-06-25 by the reprex package (v2.0.0)

Edit: An alternative way using !!! (bang-bang-bang).

.args <- str_c('df', 1:10) %>%
    map(~eval(sym(.x)))

bind_rows(!!!.args) %>% 
    as_tibble()
#> # A tibble: 1,500 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1          5.1         3.5          1.4         0.2 setosa 
#>  2          4.9         3            1.4         0.2 setosa 
#>  3          4.7         3.2          1.3         0.2 setosa 
#>  4          4.6         3.1          1.5         0.2 setosa 
#>  5          5           3.6          1.4         0.2 setosa 
#>  6          5.4         3.9          1.7         0.4 setosa 
#>  7          4.6         3.4          1.4         0.3 setosa 
#>  8          5           3.4          1.5         0.2 setosa 
#>  9          4.4         2.9          1.4         0.2 setosa 
#> 10          4.9         3.1          1.5         0.1 setosa 
#> # … with 1,490 more rows

Upvotes: 1

stefan
stefan

Reputation: 124013

The easier way would be to avoid having a bunch of dataframes named df1, ..., df100 from the start by making use of a list. Be that as it may. You could make use of lapply and get to put your df's into a list and call bind_rows on the list:

library(dplyr)

df1 <- mtcars
df2 <- mtcars
df3 <- mtcars

df_bind <- lapply(1:3, function(x) get(paste0("df", x))) %>% 
  bind_rows()

head(df_bind)
#>                        mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4...1         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag...2     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710...3        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive...4    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout...5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant...6           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

or using mget you could do

mget(paste0("df", 1:3)) %>% bind_rows()

Upvotes: 6

Related Questions