Reputation: 9796
I have multiple Excel files, each with two columns. One being the same and the other being different for different files.
How can I merge all the files by their common column and save them as a new file?
Note:
I don't wish to combine them one under the other (using rbind
). Instead, I want to merge
them based on the commun column.
For R:
I have filenames in the following format.
percent- 15 knt 03_01__00_51.csv
percent- 20 knt 03_01__00_54.csv
percent- 25 knt 03_01__00_57.csv
Due to the file names format I can't write a script to read each file individually. I don't know how to write a loop that iterates over just 15 ,20 .... and leaves the end part somehow.
Upvotes: 0
Views: 4027
Reputation: 70256
You can probably do something like:
# if those are the only files in the folder, you don't even need a pattern
filelist <- list.files(pattern = "^percent.*\\.csv$") # read all file names according to pattern
files <- lapply(filelist, read.csv, header=TRUE) # read all files in filelist
files <- lapply(files, function(x) x[-1]) # remove first column of each file
DF = Reduce(function(...) merge(..., by = "CommonColumn", all=T), files) # merge all files
x <- sub("^(percent- )(\\d+)(\\s.*)$", "\\2", filelist) # get the file name numbers
names(DF[-1]) <- paste(names(DF[-1]), x, sep = "-") # add file name numbers to column names in DF
write.csv(DF, "myfile.csv") # write data to new file
The Reduce()
part is taken from here.
Upvotes: 2