Reputation: 51
I'm trying to combine multiple dataframes into one dataframe using bind_rows. Each dataframe has the same column names and length. Let's say each dataframe is named "df" and there are 100 of them. So df1, df2, df3... df100.
I do not want to write each dataframe in the function as bind_rows(df1, df2 ... df100). I tried
total_df <- bind_rows(paste0(df1:df100))
It did not work. Would there be an easier way to do this? Thank you!!
Upvotes: 5
Views: 331
Reputation: 2240
I like using the data.table function rbindlist
with the idcol
parameter. That will show you the source table/df the row came from prior to binding.
I also like rbindlist
because it will give you the option to fill bind tables with some columns added/missing with the fill = TRUE
.
library(data.table)
df1 <- mtcars
df2 <- mtcars
df3 <- mtcars
# Assuming your files are in working memory and have a common naming convention #
my_mem_objects <- lapply(ls(pattern = "df"), get)
# Assign names of the list to the same name they had as memory objects
names(my_mem_objects) <- ls(pattern = "df")
# Combine the dataframes and retain the name of each object the output as new column
alldata <- rbindlist(my_mem_objects,idcol = "Source_DF")
head(alldata)
tail(alldata)
Gives this output:
Source_DF mpg cyl disp hp drat wt qsec vs am gear carb
1: 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
2: 1 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
3: 1 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
4: 1 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
5: 1 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
6: 1 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
and:
Source_DF mpg cyl disp hp drat wt qsec vs am gear carb
1: 3 26.0 4 120.3 91 4.43 2.140 16.7 0 1 5 2
2: 3 30.4 4 95.1 113 3.77 1.513 16.9 1 1 5 2
3: 3 15.8 8 351.0 264 4.22 3.170 14.5 0 1 5 4
4: 3 19.7 6 145.0 175 3.62 2.770 15.5 0 1 5 6
5: 3 15.0 8 301.0 335 3.54 3.570 14.6 0 1 5 8
6: 3 21.4 4 121.0 109 4.11 2.780 18.6 1 1 4 2
Upvotes: 0
Reputation: 7106
library(tidyverse)
#generate some data
rerun(10, iris) %>%
walk2(str_c('df', 1:length(.)), ~assign(.y, .x, pos = .GlobalEnv))
map(str_c('df', 1:10), ~ eval(sym(.x))) %>%
reduce(bind_rows) %>%
as_tibble() #to avoid console flooding
#> # A tibble: 1,500 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # … with 1,490 more rows
#if all the df's are inside a list
data <- rerun(10, iris)
reduce(data, bind_rows) %>%
as_tibble()
#> # A tibble: 1,500 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # … with 1,490 more rows
Created on 2021-06-25 by the reprex package (v2.0.0)
Edit: An alternative way using !!!
(bang-bang-bang).
.args <- str_c('df', 1:10) %>%
map(~eval(sym(.x)))
bind_rows(!!!.args) %>%
as_tibble()
#> # A tibble: 1,500 x 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 5.1 3.5 1.4 0.2 setosa
#> 2 4.9 3 1.4 0.2 setosa
#> 3 4.7 3.2 1.3 0.2 setosa
#> 4 4.6 3.1 1.5 0.2 setosa
#> 5 5 3.6 1.4 0.2 setosa
#> 6 5.4 3.9 1.7 0.4 setosa
#> 7 4.6 3.4 1.4 0.3 setosa
#> 8 5 3.4 1.5 0.2 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 4.9 3.1 1.5 0.1 setosa
#> # … with 1,490 more rows
Upvotes: 1
Reputation: 124013
The easier way would be to avoid having a bunch of dataframes named df1, ..., df100 from the start by making use of a list. Be that as it may. You could make use of lapply
and get
to put your df's into a list and call bind_rows
on the list:
library(dplyr)
df1 <- mtcars
df2 <- mtcars
df3 <- mtcars
df_bind <- lapply(1:3, function(x) get(paste0("df", x))) %>%
bind_rows()
head(df_bind)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4...1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag...2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> Datsun 710...3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> Hornet 4 Drive...4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> Hornet Sportabout...5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
#> Valiant...6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
or using mget
you could do
mget(paste0("df", 1:3)) %>% bind_rows()
Upvotes: 6