Emma Tebbs

Reputation: 1467

skip rows in read.table within for loop

I'm importing a number of text files using:

d <- read.table(fid, skip = 21, header = FALSE, comment.char = '/', sep = '\t', fill = TRUE)

where fid is defined within a loop, so I load a different file in each iteration. The issue I have is that the number of lines I need to skip varies between files. For example, in this specific file I need to skip 21 lines, whereas in the next it will be 22. The lines I need to skip form a comment block that opens with '/*' and closes with '*/'. For example:

/* DATA DESCRIPTION:
blah blah
blah blah
*/
data starts here

So for this example, I would use

d <- read.table(fid, skip = 4, header = FALSE, comment.char = '/', sep = '\t', fill = TRUE)

One option I thought of was to delete all of these lines through the terminal but there must be a way of doing this in R.

Any ideas?

Upvotes: 0

Views: 746

Answers (1)

lmo

Reputation: 38520

To follow up on the comment of @Thomas, you might try the following:

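for(file in fileList) {
  # read the file as plain text and find the line that closes the /* ... */ block
  temp <- readLines(file)
  skipLine <- which(trimws(temp) == "*/")[1]  # first match, ignoring stray whitespace

  # read the data, skipping everything up to and including the closing line
  d <- read.table(file, skip = skipLine, header = FALSE, comment.char = '/', sep = '\t', fill = TRUE)
}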

If you know the maximum number of comment lines and your files are fairly large, you can pass that number as the n argument to readLines() so that only the top of each file is read, which reduces the size of the read and speeds up the process.
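
As a rough sketch of that idea, the helper below reads only the first few lines of a file to locate the closing '*/' and then passes the result to read.table. The function name read_after_header and the max_header bound of 50 are assumptions made for illustration, not something from the question; the temporary file just mimics the layout shown there.

```r
# Hypothetical helper: read one file whose header is a /* ... */ block.
# max_header is an assumed upper bound on the number of header lines.
read_after_header <- function(path, max_header = 50) {
  # read only the top of the file instead of the whole thing
  head_lines <- readLines(path, n = max_header)
  # index of the first line that closes the comment block (NA if none)
  end_line <- which(trimws(head_lines) == "*/")[1]
  if (is.na(end_line)) {
    stop("no closing '*/' found in the first ", max_header, " lines of ", path)
  }
  # skip the header, including the closing line, and read the rest
  read.table(path, skip = end_line, header = FALSE,
             comment.char = "/", sep = "\t", fill = TRUE)
}

# Small demonstration on a temporary file laid out like the example in the question.
demo_file <- tempfile(fileext = ".txt")
writeLines(c("/* DATA DESCRIPTION:", "blah blah", "blah blah", "*/",
             "1\t2\t3", "4\t5\t6"), demo_file)
d <- read_after_header(demo_file)
```

To keep the result for every file rather than overwriting d on each pass, something like lapply(fileList, read_after_header) should return a list with one data frame per file.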

Upvotes: 2
