EJrandom

Reputation: 29

Selecting specific columns and adding csv names to final csv file

I'm trying to extract the first 16 columns from many CSV files that sit in different sub-directories, combine them into one CSV, and add the source file name to each row. My code:

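library(dplyr)
getwd()
# start from the top-level directory only; dir(..., recursive = TRUE) already
# descends into sub-directories, so passing list.dirs() output lists files twice
root <- "."
# get list of files ending in csv under root
dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
# read files into data frames
lapply(FUN = read.csv) %>%
# bind all data frames into a single data frame (rbind_all is deprecated)
bind_rows() %>%
# write into a single csv file
write.csv("all.csv")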

Where in this pipeline should I put the code that selects the columns and adds the file names?

ANSWER:

library(dplyr)
getwd()
# start from the top-level directory only; dir(..., recursive = TRUE) already
# descends into sub-directories, so passing list.dirs() output lists files twice
root <- "."
# get list of files ending in csv under root
dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
# read files into data frames, select first 16 columns and add filename
lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name=p)) %>%
# bind all data frames into a single data frame (rbind_all is deprecated)
bind_rows() %>%
# write into a single csv file
write.csv("all.csv")

Upvotes: 1

Views: 182

Answers (1)

scoa

Reputation: 19867

Do it inside the lapply call, since that is the last step where the file name/path is still available:

dir(root, pattern='csv$', recursive = TRUE, full.names = TRUE) %>%
  lapply(FUN = function(p) read.csv(p) %>% select(1:16) %>% mutate(file_name=p)) %>%
  bind_rows() %>%
  write.csv("all.csv")
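For anyone who wants to try the idea without dplyr, here is a minimal base-R sketch of the same per-file step. The demo directory, the two 18-column input files, and the helper names `make_df` and `read_one` are made up for illustration and are not from the question. It also calls `list.files()` once on the top-level directory, which should avoid picking up the same file several times, and it stores `basename(p)` rather than the full path, on the assumption that only the file name is wanted.

```r
# Hypothetical demo layout: two CSV files, one of them in a sub-directory.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE, showWarnings = FALSE)

# Build a small 2-row, 18-column data frame (columns V1..V18).
make_df <- function(offset) {
  as.data.frame(matrix(seq_len(36) + offset, nrow = 2, ncol = 18))
}
write.csv(make_df(0), file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(make_df(100), file.path(demo_dir, "sub", "b.csv"), row.names = FALSE)

# A single recursive listing from the top-level directory.
paths <- list.files(demo_dir, pattern = "\\.csv$",
                    recursive = TRUE, full.names = TRUE)

# Read one file, keep its first 16 columns, and tag every row with the
# file it came from. This has to happen here, while the path is known.
read_one <- function(p) {
  df <- read.csv(p)[, 1:16, drop = FALSE]
  df$file_name <- basename(p)
  df
}

# Stack the per-file data frames into one.
combined <- do.call(rbind, lapply(paths, read_one))
```

From here, `write.csv(combined, "all.csv", row.names = FALSE)` would write the result. If the output file lands inside the directory being scanned, a second run would read it back in as input, so writing it elsewhere may be safer.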

Upvotes: 2
