Reputation: 23
I have several data frames in R that have similar names: "datafile_20180801", "datafile_20180802",... --> they all start with 'datafile_' followed by the date of extraction. They have the same column names, how can I stack them into one data frame in R without having to type every single name into rbind? (I'm talking about 30+ data-frames)
Thanks in advance!
Upvotes: 1
Views: 608
Reputation: 269501
The question did not provide reproducible data so we use the reproducible data in the Note at the end. The question states the column names are "similar and we have assumed that means that all data frames have the same column names.
Let env
define the environment where the data frames are located. Below we assume that it is the global environment. (Note that the default for env
in both ls
and mget
is the global environment so another possibility in that case is to just omit the env
argument in both instances.) List the names in that environment that match the indicated regular expression giving Names
. Then get the objects themselves using mget
giving the list L
and bind the components of L
together using rbind
. The row names of the result will indicate which data frame each row came from. No packages are used.
env <- .GlobalEnv
Names <- ls(pattern = "^datafile_\\d{8}$", env)
L <- mget(Names, env)
do.call("rbind", L)
giving the following using the data shown reproducibly in the Note at the end:
Time demand
datafile_20180801.1 1 8.3
datafile_20180801.2 2 10.3
datafile_20180801.3 3 19.0
datafile_20180801.4 4 16.0
datafile_20180801.5 5 15.6
datafile_20180801.6 7 19.8
datafile_20180802.1 10 83.0
datafile_20180802.2 20 103.0
datafile_20180802.3 30 190.0
datafile_20180802.4 40 160.0
datafile_20180802.5 50 156.0
datafile_20180802.6 70 198.0
An alternative to the last line of code above is to use bind_rows
from dplyr. Instead of creating row names that identify the source of each row it will create a new id
column to do that.
library(dplyr)
bind_rows(L, .id = "id")
giving the following but with warnings (which could be avoided if we convert the factor columns to character first):
id Time demand
1 datafile_20180801 1 8.3
2 datafile_20180801 2 10.3
3 datafile_20180801 3 19.0
4 datafile_20180801 4 16.0
5 datafile_20180801 5 15.6
6 datafile_20180801 7 19.8
7 datafile_20180802 10 83.0
8 datafile_20180802 20 103.0
9 datafile_20180802 30 190.0
10 datafile_20180802 40 160.0
11 datafile_20180802 50 156.0
12 datafile_20180802 70 198.0
We could express the above using magrittr if desired:
library(magrittr)
.GlobalEnv %>%
mget(ls(pattern = "^datafile_\\d{8}$", .), .) %>%
do.call("rbind", .)
We have made use of the builtin data frame BOD
to create reproducible input.
datafile_20180801 <- transform(BOD, Time = factor(Time))
datafile_20180802 <- transform(BOD, Time = factor(10*Time), demand = 10 * demand)
Upvotes: 1