user10023443

read text files

I want to know whether this code reads a set of text files and re-saves them under the same names. When I tested it, the list generated in the reading phase was empty. This is my code:

library('textreadr')
path <- ("C:/testnum/")
files <- list.files(path=path, pattern="*.txt") 
lines<-list()

for (i in 1:length(files)){
    lines[[i]] <- scan(files[i])
  }

lines[i]<-lapply(names(files), function(x) 
  writeLines(lines[x], file=paste(path, x, ".txt", sep = "")))

Upvotes: 0

Views: 180

Answers (1)

r2evans

Reputation: 160407

Several things to correct:

  • list.files returns an unnamed character vector, so names(files) will be NULL
  • list.files is currently returning just the file names, not the full paths needed to read them, so your scan calls will only work if your working directory happens to contain files with the same names; it is much safer to be defensive and include the full path in each filename (full.names=TRUE)
  • your use of lapply(names(files), function(x) writeLines(lines[x], ...)) is missing the point that lapply passes each element itself to the function (x would be a filename, not an index into the vector); also, lines[x] returns a one-element list, whereas lines[[x]] returns its contents
  • scan does its own open/close if you give it a filename, so we can simplify the code there
  • writeLines returns NULL, so I'm not sure why you'd want to capture that into lines[i] (even if i had a meaningful value at that point)
  • None of the code you're using needs the package you're loading. I'm not certain whether you assume you need it for this functionality, or whether you are using it elsewhere (in which case it should not be in the question).
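To make the first two points concrete, here is a minimal sketch. The scratch directory `demo_dir` and the files `a.txt` and `b.txt` are hypothetical names created only for this illustration; they are not part of your setup.

```r
# Hypothetical scratch directory and file names, used only for illustration.
demo_dir <- file.path(tempdir(), "listfiles_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("1", "2", "3"), file.path(demo_dir, "a.txt"))
writeLines(c("4", "5"), file.path(demo_dir, "b.txt"))

# Without full.names: bare file names, relative to demo_dir.
bare <- list.files(path = demo_dir, pattern = "\\.txt$")
# With full.names: paths that can be opened from any working directory.
full <- list.files(path = demo_dir, pattern = "\\.txt$", full.names = TRUE)

is.null(names(bare))  # TRUE: the vector is unnamed, so looping over names(bare) does nothing
bare                  # only the file names
file.exists(full)     # the full paths resolve regardless of getwd()
```

Since `names(bare)` is `NULL`, an `lapply` over it returns an empty list, which is consistent with the empty result you saw.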

Try this:

path <- ("C:/Users/abidi/Desktop/testingSet/testnum/")
files <- list.files(path=path, pattern="\\.txt$", full.names=TRUE)  # pattern is a regular expression, not a glob
lines <- sapply(files, scan, simplify=FALSE)

Then write them out while ignoring/discarding the output:

ign <- lapply(files, function(fn) writeLines(lines[[fn]], paste0(fn, ".txt")))

That last line can be even shorter:

ign <- Map(writeLines, lines, paste0(files, ".txt"))

In both cases, ign is merely a throw-away variable: the return value from writeLines is NULL, so all you'll have there is a list of NULLs.
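One caveat, offered as a sketch rather than a definitive recipe: with its default `what = double()`, scan returns numeric vectors, while writeLines only accepts character vectors. If your files hold numbers, the values need converting before they are written. The directory `demo_dir`, the `out` subfolder, and the file `a.txt` below are hypothetical and exist only for this example.

```r
# Hypothetical scratch directory, used only for illustration.
demo_dir <- file.path(tempdir(), "roundtrip_demo")
out_dir <- file.path(demo_dir, "out")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
writeLines(c("1", "2", "3"), file.path(demo_dir, "a.txt"))

files <- list.files(path = demo_dir, pattern = "\\.txt$", full.names = TRUE)

# sapply() on a character vector names the result by its elements,
# so each entry of lines can be looked up by its full path.
lines <- sapply(files, scan, quiet = TRUE, simplify = FALSE)

# scan() produced numeric vectors; convert them before writing.
out_files <- file.path(out_dir, basename(files))
ign <- Map(function(x, fn) writeLines(as.character(x), fn), lines, out_files)
```

If the files contain arbitrary text rather than numbers, reading with readLines instead of scan gives character vectors directly and the conversion is not needed.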

Lastly, I'm assuming you are doing something meaningful to the contents of lines between reading them in and re-writing them to new files (that have an additional .txt appended, e.g., filename.txt.txt).
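If you really do want to overwrite the originals rather than create filename.txt.txt, a minimal sketch is to write back to the same paths. Everything here is an assumption for illustration: the scratch directory `demo_dir`, the file `a.txt`, and `toupper` standing in for whatever processing you actually do.

```r
# Hypothetical scratch directory, used only for illustration.
demo_dir <- file.path(tempdir(), "samename_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("alpha", "beta"), file.path(demo_dir, "a.txt"))

files <- list.files(path = demo_dir, pattern = "\\.txt$", full.names = TRUE)
lines <- sapply(files, readLines, simplify = FALSE)

# Placeholder transformation; replace with the real processing step.
lines <- lapply(lines, toupper)

# Writing to files itself overwrites each original in place.
ign <- Map(writeLines, lines, files)

# By contrast, this is the name the paste0() variant would write to.
doubled <- paste0(files, ".txt")
```

Overwriting in place is destructive, so it may be worth testing on a copy of the folder first.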

Upvotes: 2
