Reputation: 75
Following up from this previous question; consolidating data frames in R using lapply works perfectly to read and combine multiple csv files.
My question is, say you also want to create and append an identifier to this merged dataset based on the filename you are reading in.
For example if you have 2 datasets and they are;
file 2014_1.csv;
Var1 Var2
21 140
2 134
3 135
file 2014_2.csv;
Var1 Var2
21 131
2 134
I want my final table to look like this;
Var1 Var2 Period
21 140 2014_1
2 134 2014_1
3 135 2014_1
21 131 2014_2
2 134 2014_2
Is there a way to do this?
Upvotes: 2
Views: 200
Reputation: 886978
Using fread
from data.table
(which would be faster)
files <- list.files(pattern="\\d{4}_\\d.csv")
library(data.table)
library(tools)
rbindlist(
lapply(files, function(x) cbind(fread(x), Period=file_path_sans_ext(x)) ))
# Var1 Var2 Period
#1: 21 140 2014_1
#2: 2 134 2014_1
#3: 3 135 2014_1
#4: 21 131 2014_2
#5: 2 134 2014_2
Or as suggested by @Arun
rbindlist(lapply(files, function(x) fread(x)[,Period:=file_path_sans_ext(x)]))
Upvotes: 2
Reputation: 75
alright, I've figured it out. the following code takes all the csv files in a folder, attaches the file name as an id variable and combines them.
files <- list.files()
files
# read the files into a list of data.frames
data.list <- lapply(files, function(.file){
dat<-read.csv(.file, header = F)
dat$period<-as.character(.file)
dat
})
# concatenate into one big data.frame
data.cat <- do.call(rbind, data.list)
Upvotes: 2
Reputation: 41
based on your previous post i'd suggest something like this
data.list <- cbind(lapply(files, read.csv), files)
Upvotes: 1