Reputation: 482
I want to apply the read.table() function to a list of .txt files. These files are in my current directory.
my.txt.list <-
list("subject_test.txt", "subject_train.txt", "X_test.txt", "X_train.txt")
Before applying read.table() to the elements of this list, I want to check whether the corresponding data table (dt) has already been computed and stored in a cache directory. Data tables from the cache directory are already loaded in my environment, named in the form file_name.dt:
R> ls()
"subject_test.dt" "subject_train.dt"
In this example, I only want to read "X_test.txt" and "X_train.txt". I wrote a small function that tests whether a dt has already been cached and applies read.table() if it has not.
my.rt <- function(x,...){
  # apply read.table to txt files if data table is not already cached
  # x is a character vector
  y <- strsplit(x,'.txt')
  y <- paste(y,'.dt',sep = '')
  if (y %in% ls() == FALSE){
    rt <- read.table(x, header = F, sep = "", dec = '.')
  }
}
This function works if I call it on a single element, like this:
subject_test.dt <- my.rt('subject_test.txt')
Now I want to apply it to my file list with sapply, like this:
my.res <- sapply(my.txt.list,my.rt)
I get my.res as a list of data frames, but the problem is that the function reads all the files and does not take the already computed ones into account.
I must be missing something, but I can't see why.
Thanks for any suggestions.
Upvotes: 0
Views: 459
Reputation: 1143
I think it has to do with the use of strsplit in your example: strsplit returns a list.
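To illustrate the point about strsplit, here is a minimal sketch using file names from the question. The variable names parts and dt.names are made up for this illustration, and sub() with fixed = TRUE is shown only as one possible alternative that returns a plain character vector:

```r
# strsplit() returns a list with one element per input string,
# even when the input is a single string
parts <- strsplit("subject_test.txt", ".txt", fixed = TRUE)
class(parts)   # "list"
parts[[1]]     # "subject_test"

# sub() operates on a character vector and returns a character vector,
# so the result can be compared against names directly
dt.names <- sub(".txt", ".dt", c("subject_test.txt", "X_test.txt"), fixed = TRUE)
dt.names       # "subject_test.dt" "X_test.dt"
```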
What about this?
my.txt.files <- c("subject_test.txt", "subject_train.txt", "X_test.txt", "X_train.txt")
> ls()
[1] "subject_test.dt" "subject_train.dt"
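my.rt <- function(x){
  # name the cached data table would have, e.g. "X_test.txt" -> "X_test.dt"
  y <- gsub(".txt", ".dt", x, fixed = TRUE)
  # ls() with no arguments inside a function lists only that function's own
  # variables, so look in the global environment explicitly
  if (!(y %in% ls(envir = globalenv()))) {
    read.table(x, header = FALSE, sep = "", dec = '.')
  }
}
my.res <- sapply(my.txt.files, FUN = my.rt)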
Note that I'm replacing .txt with .dt and testing for "not in". You will get NULL entries in the result list for any file that is not processed.
This is untested, but I think it should work...
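Since the code above is untested, here is a small self-contained sketch that can be run without the original data. It is an illustration under assumptions, not the asker's setup: show.scope, read.if.missing, cache and the temporary demo files are all hypothetical names, and a separate environment stands in for the workspace holding the cached tables. It also shows that ls() called with no arguments inside a function lists only that function's local variables, which matters for any "is it already cached" check done inside a function.

```r
# ls() with no arguments, called inside a function, sees only local variables
show.scope <- function(x) {
  y <- "local"
  ls()
}
show.scope(1)   # "x" "y"

# Hypothetical helper: read a file unless <name>.dt already exists in `env`
read.if.missing <- function(file, env = globalenv()) {
  dt.name <- sub(".txt", ".dt", basename(file), fixed = TRUE)
  if (exists(dt.name, envir = env, inherits = FALSE)) {
    return(NULL)  # already cached, skip the read
  }
  read.table(file, header = FALSE, sep = "", dec = ".")
}

# Stand-in for the workspace: one table is already "cached"
cache <- new.env()
assign("subject_test.dt", data.frame(V1 = 1:3), envir = cache)

# Write two tiny whitespace-separated files to a temporary directory
demo.dir <- tempfile("demo")
dir.create(demo.dir)
files <- file.path(demo.dir, c("subject_test.txt", "X_test.txt"))
for (f in files) {
  write.table(data.frame(a = 1:2, b = 3:4), f,
              row.names = FALSE, col.names = FALSE)
}

# lapply always returns a list; skipped files show up as NULL entries
res <- lapply(files, read.if.missing, env = cache)
names(res) <- basename(files)
```

Here res[["subject_test.txt"]] is NULL because its table is already in the cache environment, while res[["X_test.txt"]] holds the data frame that was read. Passing the environment as an argument is a design choice for the demo; with tables stored in the workspace, the default globalenv() would be used instead.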
Upvotes: 1