Reputation: 1507
I have an output directory from dbcans with each sample's output in its own subdirectory. I need to loop over each subdirectory and read into R a file called overview.csv.
for (subdir in list.dirs(recursive = FALSE)){
data = read.csv(file.path(~\\subdir, "overview.csv"))
}
I am unsure how to deal with the changing file path in read.csv for each subdir. Any help would be appreciated.
Upvotes: 0
Views: 90
Reputation: 160417
Up front, the ~\\subdir (not as a string) is obviously problematic. Since subdir is already a string, using file.path is correct, but with just the variable: file.path(subdir, "overview.csv"). If you are concerned about relative versus absolute paths, you can always normalize the paths with normalizePath(list.dirs())
, though this does not really change things if you use `
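As a minimal sketch of that fix, the loop from the question can be written with file.path(subdir, "overview.csv") and each result stored under the subdirectory's name. The temporary directory, the sample names, and the columns gene and hits are made up for illustration so that the snippet runs on its own; they are not from the original post.

```r
# Hypothetical layout for illustration only:
#   <root>/sampleA/overview.csv, <root>/sampleB/overview.csv
root <- file.path(tempdir(), "dbcan_demo")
for (s in c("sampleA", "sampleB")) {
  dir.create(file.path(root, s), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(gene = c("g1", "g2"), hits = c(1, 2)),
            file.path(root, s, "overview.csv"), row.names = FALSE)
}

results <- list()
for (subdir in list.dirs(root, recursive = FALSE)) {
  # subdir is already a character string, so pass it straight to file.path()
  csv_path <- file.path(subdir, "overview.csv")
  # skip subdirectories that do not contain the file
  if (file.exists(csv_path)) {
    results[[basename(subdir)]] <- read.csv(csv_path)
  }
}
```

Keying the list by basename(subdir) keeps track of which sample each data frame came from.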
A few things to consider.
Constantly reassigning to the same variable doesn't help, so either you need to assign to an element of a list or something else (e.g., lapply, below). (I also think data as a variable name is problematic. While it works just fine "now", if you ever run part of your script without assigning to data, you will be referencing the function, resulting in possibly confusing errors such as Error in data$a : object of type 'closure' is not subsettable; since a closure is really just a function with its enclosing namespace/environment, this is just saying "you tried to do something to a function".)
I think both pattern= and full.names= might be useful to switch from using list.dirs to list.files, such as
datalist <- list()
# I hope recursion doesn't go too deep here
# pattern is a regular expression, so anchor it and escape the dot
filelist <- list.files(pattern = "^overview\\.csv$", full.names = TRUE, recursive = TRUE)
for (ind in seq_along(filelist)) {
  datalist[[ind]] <- read.csv(filelist[ind])
}
# perhaps combine into one frame
data1 <- do.call(rbind, datalist)
Reading in lots of files and doing the same thing to all of them suggests lapply. This is a slightly more compact version of the loop above:
filelist <- list.files(pattern = "^overview\\.csv$", recursive = TRUE, full.names = TRUE)
datalist <- lapply(filelist, read.csv)
data1 <- do.call(rbind, datalist)
Note: if you really only need precisely one level of subdirs, you can work around that with:
filelist <- list.files(list.dirs(somepath, recursive = FALSE),
                       pattern = "^overview\\.csv$", full.names = TRUE)
or you can limit to no more than some depth, perhaps with list.dirs.depth.n from https://stackoverflow.com/a/48300309.
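Since each subdirectory holds one sample, it may help to record which sample every row came from before combining. The following is a rough sketch of the one-level list.files approach with that addition. The temporary directory, the sample names, the columns gene and hits, and the added sample column are all assumptions for illustration, not part of the original answer.

```r
# Hypothetical layout, built in a temp directory so the sketch runs on its own
root <- file.path(tempdir(), "dbcan_demo_combined")
for (s in c("sampleA", "sampleB")) {
  dir.create(file.path(root, s), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(gene = c("g1", "g2"), hits = c(1, 2)),
            file.path(root, s, "overview.csv"), row.names = FALSE)
}

# exactly one level of subdirectories; pattern is a regular expression
filelist <- list.files(list.dirs(root, recursive = FALSE),
                       pattern = "^overview\\.csv$", full.names = TRUE)

# name each path after the directory that holds the file
names(filelist) <- basename(dirname(filelist))

datalist <- lapply(names(filelist), function(sample_name) {
  df <- read.csv(filelist[[sample_name]])
  df$sample <- sample_name  # record the source subdirectory
  df
})

# one frame with all samples stacked
combined <- do.call(rbind, datalist)
```

This assumes every overview.csv has the same columns; if they differ, rbind will stop with an error and the frames would need to be aligned first.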
Upvotes: 2
Reputation: 8117
I think it should be this.
for (subdir in list.dirs(recursive = FALSE)){
  data = read.csv(file.path(subdir, "overview.csv"))
}
Upvotes: 1