user2716568
user2716568

Reputation: 1946

How to read in multiple .csv files, merge and create new column?

I wish to read in multiple .csv files from a folder, located on my desktop, merge all these .csv files into one data.frame and create a column that contains the name of each file.

I have tried:

my.path <- "C:/Users/JB/Desktop/RawFiles/"
filenames <- paste(my.path, list.files(path=my.path), sep="/")
my.df <- do.call(rbind, lapply(filenames, function(x) cbind(read.csv(x), name=strsplit(x,'\\.')[[1]][1])))

However, my.dt$name doesn't contain the filename of the .csv. Rather, it contains the location of the file. For example, head(my.dt, 3) produces:

 name
1 C:/Users/JB/Desktop/RawFiles//Analysis_01122016
2 C:/Users/JB/Desktop/RawFiles//Analysis_01122016
3 C:/Users/JB/Desktop/RawFiles//Analysis_01122016

How can I edit this so my.dt$name only produces the file.name? My anticipated output would be:

name
    1 Analysis_01122016
    2 Analysis_01122016
    3 Analysis_01122016

Thanks!

Upvotes: 0

Views: 1033

Answers (2)

CJ Battey
CJ Battey

Reputation: 81

You can split the file path on "/" with strsplit(), and grab the last element of the resulting vector.

files <- list.files("~/Dropbox/structure_simulations/str_in/k2",full.names = T)

df <- lapply(files,function(e) {
  tmp <- read.csv(e)
  name <- unlist(strsplit(e,"/"))[length(unlist(strsplit(e,"/")))] 
  name <- unlist(strsplit(name,"\\."))[1]
  tmp$name <- name
  tmp
  })
df <- do.call(rbind,df)

Upvotes: 0

neilfws
neilfws

Reputation: 33782

First, you don't need to use a variable for the path; the full.names = TRUE argument to list.files will do that for you.

If you want just the file name and extension, you can use:

basename("full/path/to/myfile.csv")

If you want to remove the extension too:

tools::file_path_sans_ext(basename("full/path/to/myfile.csv"))

So you can do something like this with your code:

filenames <- list.files(path = "C:/Users/JB/Desktop/RawFiles", full.names = TRUE)
my.df <- do.call(rbind,
                 lapply(filenames, function(x) 
                                   cbind(read.csv(x), 
                                   name = tools::file_path_sans_ext(basename(x)))))

But there are cleaner solutions using tools such as plyr::ldply or data.table and you'll find examples in other answers with the right search.

Upvotes: 1

Related Questions