gabx

Reputation: 482

R 3.1 sapply to a list of files

I want to apply the read.table() function to a list of .txt files. These files are in my current directory.

 my.txt.list <-
 list("subject_test.txt", "subject_train.txt", "X_test.txt", "X_train.txt")

Before applying read.table() to the elements of this list, I want to check whether the dt has already been computed and stored in a cache directory. The dt objects from the cache directory are already loaded in my environment, named file_name.dt:

 R> ls()
 "subject_test.dt"  "subject_train.dt"

In this example, I only want to read "X_test.txt" and "X_train.txt". I wrote a small function that tests whether the dt has already been cached and applies read.table() if it has not.

 my.rt <- function(x,...){
 # apply read.table to txt files if data table is not already cached
 # x is a character vector
 y <- strsplit(x,'.txt')
 y <- paste(y,'.dt',sep = '')
 if (y %in% ls() == FALSE){
     rt <- read.table(x, header = F, sep = "", dec = '.') 
}        
}

This function works if I call it on a single element this way:

 subject_test.dt <- my.rt('subject_test.txt')

Now I want to sapply over my list of files this way:

 my.res <- sapply(my.txt.list,my.rt)

I get my.res as a list of data frames, but the issue is that the function reads all the files and does not take the already computed ones into account.

I must be missing something, but I can't see why.

Thanks for any suggestions.

Upvotes: 0

Views: 459

Answers (1)

swolf

Reputation: 1143

I think it has to do with the use of strsplit in your example. strsplit returns a list.

What about this?

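my.txt.files <- c("subject_test.txt", "subject_train.txt", "X_test.txt", "X_train.txt")
# ls() at the prompt shows the cached tables:
# [1] "subject_test.dt"  "subject_train.dt"
my.rt <- function(x){
  # name the cached table would have: "X_test.txt" -> "X_test.dt"
  y <- gsub(".txt", ".dt", x, fixed = TRUE)
  # look in the global environment; a bare ls() here would only list x and y
  if (!(y %in% ls(envir = globalenv()))) {
    read.table(x, header = FALSE, sep = "", dec = '.')
  }
}
my.res <- sapply(my.txt.files, FUN = my.rt)

Note that I'm replacing .txt with .dt and I'm doing a "not in". The check passes envir = globalenv() to ls(), because ls() with no arguments inside a function only lists that function's local variables. You will get NULL entries in the result list if a file is not processed.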
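
If the NULL entries get in the way, they can be dropped afterwards. This is only a sketch: res is a hypothetical stand-in for my.res, built by hand so that it runs without the actual files. Filter() and Negate() are base R.

```r
# Hypothetical stand-in for my.res: two skipped files (NULL) and two that were read
res <- list(
  subject_test.txt  = NULL,
  subject_train.txt = NULL,
  X_test.txt        = data.frame(V1 = 1:2),
  X_train.txt       = data.frame(V1 = 3:4)
)

# Keep only the entries that were actually read
loaded <- Filter(Negate(is.null), res)

# Rename the entries to match the file_name.dt convention
names(loaded) <- sub(".txt", ".dt", names(loaded), fixed = TRUE)
names(loaded)  # "X_test.dt" "X_train.dt"
```

Also note that sapply() may try to simplify the result when every element has the same length. Calling it with simplify = FALSE should always give back a named list.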

This is untested, but I think it should work...
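
One more thing that may be worth checking is where ls() looks. Called with no arguments inside a function, it lists that function's local variables, not the objects in your workspace. A small sketch to verify this, assuming the cached tables live in the global environment. The names demo.dt, is_cached_local and is_cached_global are made up for the illustration.

```r
# Hypothetical cached object in the global environment
assign("demo.dt", data.frame(V1 = 1:3), envir = globalenv())

# Bare ls() inside a function only sees the function's locals (here: y)
is_cached_local <- function(y) {
  locals <- ls()
  y %in% locals
}

# Looking in the global environment explicitly finds the cached object
is_cached_global <- function(y) {
  exists(y, envir = globalenv(), inherits = FALSE)
}

is_cached_local("demo.dt")   # FALSE
is_cached_global("demo.dt")  # TRUE
```

If the first call returns FALSE for you as well, that would explain why every file gets read again no matter what is already in the workspace.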

Upvotes: 1
