Reputation: 563
I have a large number of files (>50,000) to analyze. I can get a list of these files with:
myfiles <- list.files(pattern="*output*")
and then loop with
for (file in myfiles) {
"code"
}
The problem is that my system sometimes freezes due to RAM overload, so the only option left is to kill the R session and restart the loop with the same files. How can I modify the list.files call so that it selects only a certain range of files, such as 100:200 or 3500:5000? Basically, I would like to skip the files that were already analyzed before the last system freeze.
Any help would be appreciated.
Thanks.
Upvotes: 1
Views: 39
Reputation: 887531
The 'myfiles' object is a vector. So, we can create a sequence of positions (with the : operator) to subset the object when we loop
for (file in myfiles[100:200]) {
...code...
}
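If the position of the last finished file is not known after a freeze, one option is to record each file name once it has been analyzed and skip the recorded names on the next run. The following is only a sketch under assumptions: the helper remaining_files and the log file name processed_files.txt are hypothetical, and the pattern "output" is a regular expression, since list.files matches patterns as regular expressions rather than globs.

```r
# Hypothetical helper: return the files that are not yet listed in the log
remaining_files <- function(all_files, done_log) {
  # On the first run the log does not exist, so nothing has been done yet
  done <- if (file.exists(done_log)) readLines(done_log) else character(0)
  setdiff(all_files, done)
}

done_log <- "processed_files.txt"          # hypothetical log file name
myfiles <- list.files(pattern = "output")  # regular expression, not a glob

for (f in remaining_files(myfiles, done_log)) {
  # ...code...
  # Record the file name only after its analysis has finished
  write(f, file = done_log, append = TRUE)
}
```

After a restart, the same script picks up with the first file that is missing from the log, so no index range has to be worked out by hand.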
Also, the files can be split into a list with each element of length 100
lst1 <- split(myfiles, as.integer(gl(length(myfiles), 100, length(myfiles))))
Then, an idea is to loop over the list in parallel or sequentially, remove (rm) the temporary object, and call gc() to release memory
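A sequential version of that idea could look like the sketch below. The function process_in_chunks and its arguments (analyze, size, start_chunk) are hypothetical names introduced for illustration, and the grouping uses ceiling(seq_along(files) / size), which should give the same chunks of 100 as the gl() call above.

```r
# Hypothetical helper: process files chunk by chunk, freeing memory in between
process_in_chunks <- function(files, analyze, size = 100, start_chunk = 1) {
  # Positions 1..size go to chunk 1, the next `size` positions to chunk 2, etc.
  chunks <- split(files, ceiling(seq_along(files) / size))
  for (i in seq_along(chunks)) {
    if (i < start_chunk) next   # skip chunks that finished before a freeze
    for (f in chunks[[i]]) {
      tmp <- analyze(f)         # temporary object for this file
      rm(tmp)                   # remove it before moving to the next file
    }
    gc()                        # ask R to release unused memory
    message("finished chunk ", i)
  }
  invisible(length(chunks))
}
```

Assuming the real analysis is wrapped in a function of one file name, a call such as process_in_chunks(myfiles, analyze = my_analysis, start_chunk = 36) would resume at chunk 36, where my_analysis is a placeholder for that function. The printed chunk number shows where to resume after a freeze.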
Upvotes: 2