Reputation: 29
I'm trying to extract the first 16 columns from many CSV files that sit in different subdirectories, and add each file's name to every row of the combined CSV. My code:
getwd()
root<-list.dirs(".", recursive=TRUE)
# get list of files ending in csv in directory root
dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
# read files into data frames
lapply(FUN = read.csv) %>%
# bind all data frames into a single data frame
rbind_all %>%
# write into a single csv file
write.csv("all.csv")
I'd like to know where to put the code that selects the columns and adds the file names.
ANSWER:
getwd()
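# dir() below already recurses, so start from the top-level directory only;
# passing every subdirectory from list.dirs() would list nested files more than once
root <- "."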
# get list of files ending in csv in directory root
dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
# read files into data frames, select first 16 columns and add filename
lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>%
mutate(file_name=p)) %>%
# bind all data frames into a single data frame
bind_rows() %>%
# write into a single csv file
write.csv("all.csv")
Upvotes: 1
Views: 182
Reputation: 19867
Do it inside the lapply call, since that is the last step where the file name/path is still available:
dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name=p)) %>%
bind_rows() %>%
write.csv("all.csv")
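For reference, here is a slightly more defensive variant of the same pipeline, as a sketch rather than a definitive version. The function name combine_csvs and its arguments are hypothetical, not from the question. It assumes dplyr 1.0 or later (for all_of), that every file has at least n_cols columns, and that matching columns have compatible types across files. It searches from a single top-level directory, anchors the pattern on a literal ".csv" extension, and skips the output file if an earlier run left it inside the tree.

```r
library(dplyr)

# Hypothetical helper: combine the first `n_cols` columns of every CSV under `root`.
combine_csvs <- function(root = ".", out_file = "all.csv", n_cols = 16) {
  # list.files() recurses on its own, so only the top-level directory is passed
  paths <- list.files(root, pattern = "\\.csv$", recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)
  # drop the output file in case a previous run wrote it somewhere under `root`
  paths <- paths[normalizePath(paths) != normalizePath(out_file, mustWork = FALSE)]

  combined <- lapply(paths, function(p) {
    read.csv(p, stringsAsFactors = FALSE) %>%
      select(all_of(seq_len(n_cols))) %>%   # first n_cols columns, by position
      mutate(file_name = basename(p))       # use `p` instead to keep the full path
  }) %>%
    bind_rows()

  # row.names = FALSE avoids an extra unnamed index column in the output
  write.csv(combined, out_file, row.names = FALSE)
  invisible(combined)
}
```

Calling combine_csvs(".", "all.csv", 16) reproduces the behaviour asked for in the question, except that file_name holds only the file name. If files in different subdirectories share a name, storing the full path p is the safer choice.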
Upvotes: 2