Reputation: 23
I am working on a project where the first step involves merging together a large number of dataframes.
What I have so far imports all the .csv files in a directory containing outputs from an Access database. The data were collected using different methods and are split by year of collection. All of this metadata is included in the filename: Gap.2013.csv is the CSV containing all Gap-Intercept data from 2013, and SR.2014.csv contains Species Richness data from 2014.
Next, a block of repetitive code creates a column designating the 'year' variable and rbinds like data types together.
Sample code follows:
setwd("AIMRD Exports/CSV")
list.filenames <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
for (i in seq_along(list.filenames)) {
  # Create one data frame per file, named after the file itself
  assign(list.filenames[i],
         read.csv(list.filenames[i]))
}
Gap.2013.csv$Year <- 2013
SR.2013.csv$Year <- 2013
Gap.2014.csv$Year <- 2014
SR.2014.csv$Year <- 2014
Gap.2015.csv$Year <- 2015
SR.2015.csv$Year <- 2015
Gap <- rbind (Gap.2013.csv, Gap.2014.csv, Gap.2015.csv)
SR <- rbind (SR.2013.csv, SR.2014.csv, SR.2015.csv)
Does anyone have any suggestions for how to cut down on the repetition? My first thought was to somehow modify the loop at the top and use list.files(pattern = x), but no luck so far.
Upvotes: 2
Views: 90
Reputation: 567
I'd suggest keeping your first two lines where you get your list of files. Then you can write a function that breaks these out.
library(plyr)
library(stringr)
myFun <- function(files, method) {
  # Keep only the files for one method (the match is case-sensitive).
  files <- files[grep(method, files)]
  dat <- mdply(files,
               function(file) {
                 # The four-digit year comes from the filename, e.g. "Gap.2013.csv".
                 year <- as.integer(str_extract(file, "\\d{4}"))
                 iDat <- read.csv(file, stringsAsFactors = FALSE)
                 iDat$Year <- year
                 return(iDat)
               })
  return(dat)
}
Gap <- myFun(list.filenames, 'Gap') # pass the vector of filenames; the method argument is case-sensitive
SR <- myFun(list.filenames, 'SR')
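If you would rather avoid the extra packages, the same idea can probably be done in base R. This is only a sketch: read_method is a made-up name, and it assumes every file is named Method.Year.csv with a four-digit year and that all files for one method have the same columns (rbind needs that).

```r
# Hypothetical helper: read every "<method>.<year>.csv" file in `dir`,
# tag each data frame with its year, and stack the results.
read_method <- function(method, dir = ".") {
  # Match e.g. "Gap.2013.csv"; the pattern is a regular expression, not a glob.
  pattern <- paste0("^", method, "\\.[0-9]{4}\\.csv$")
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  pieces <- lapply(files, function(file) {
    dat <- read.csv(file, stringsAsFactors = FALSE)
    # Pull the four-digit year out of the filename.
    dat$Year <- as.integer(sub("^.*\\.([0-9]{4})\\.csv$", "\\1", basename(file)))
    dat
  })
  # Stack the per-year data frames into one.
  do.call(rbind, pieces)
}
```

With the working directory already set, Gap <- read_method("Gap") and SR <- read_method("SR") would then replace the block of repeated assignments.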
Upvotes: 1