Gokay
Gokay

Reputation: 75

Combining multiple files while maintaining their identities

Following up from this previous question; consolidating data frames in R using lapply works perfectly to read and combine multiple csv files.

My question is, say you also want to create and append an identifier to this merged dataset based on the filename you are reading in.

For example if you have 2 datasets and they are;

file 2014_1.csv;

Var1 Var2
21   140 
2    134 
3    135 

file 2014_2.csv;

Var1 Var2
21   131 
2    134 

I want my final table to look like this;

Var1 Var2 Period
21   140  2014_1
2    134  2014_1
3    135  2014_1
21   131  2014_2
2    134  2014_2

Is there a way to do this?

Upvotes: 2

Views: 200

Answers (3)

akrun
akrun

Reputation: 886978

Using fread from data.table (which would be faster)

 files <- list.files(pattern="\\d{4}_\\d.csv")
 library(data.table)
 library(tools)


  rbindlist(
     lapply(files, function(x) cbind(fread(x), Period=file_path_sans_ext(x)) ))
  #   Var1 Var2  Period
  #1:   21  140  2014_1
  #2:    2  134  2014_1
  #3:    3  135  2014_1
  #4:   21  131  2014_2
  #5:    2  134  2014_2

Or as suggested by @Arun

 rbindlist(lapply(files, function(x) fread(x)[,Period:=file_path_sans_ext(x)]))

Upvotes: 2

Gokay
Gokay

Reputation: 75

alright, I've figured it out. the following code takes all the csv files in a folder, attaches the file name as an id variable and combines them.

files <- list.files()
files
# read the files into a list of data.frames
data.list <- lapply(files, function(.file){
     dat<-read.csv(.file, header = F)
   dat$period<-as.character(.file)
dat
})
# concatenate into one big data.frame
data.cat <- do.call(rbind, data.list)

Upvotes: 2

ben au
ben au

Reputation: 41

based on your previous post i'd suggest something like this

data.list <- cbind(lapply(files, read.csv), files)

Upvotes: 1

Related Questions