SriniShine

Reputation: 1139

Re-writing a for-loop as a lapply function in R

I have several files that each contain a series of numbers. I want to find the numbers that are common to all of the files. For example:

a.txt
1
2
3
4

b.txt
2
4
9

c.txt
2
3
4
8
10

Output: 2, 4

The code I wrote using a for loop gives me the correct result.

fileList = c("a.txt", "b.txt", "c.txt")

for(i in 1:length(fileList)){

  tempDF = read.table(fileList[i], header = T, stringsAsFactors = F)

  if(i == 1){

    commons = tempDF$x

  }else{
    commons = intersect(commons, tempDF$x)
  }

}

print(commons)

However, I am having trouble re-writing it with lapply. How can lapply keep the value of the "commons" variable from one file to the next instead of replacing it?

lapply(fileList, function(x) getCommons(x))

getCommons <- function(file){

  fileData = read.table(file, header = T, stringsAsFactors = F)

  commons = intersect(commons, fileData)

}

Upvotes: 2

Views: 98

Answers (1)

Rich Scriven

Reputation: 99331

You could make good use of Reduce here. Since each file holds a single unnamed column of numbers, there is no need for a data frame, so we can replace read.table with scan. lapply then produces a list of three numeric vectors, and Reduce applies intersect across that list, which is simpler and faster than intersecting data frames.

Reduce(intersect, lapply(files, scan, quiet = TRUE))
# [1] 2 4
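
On the question of how "commons" is kept between files: lapply handles each element independently, whereas Reduce carries the running result from one element to the next, much like the loop does. Here is a minimal sketch of that behaviour, with the vectors typed in by hand from the question's example files (the names vals, commons and steps are just for illustration):

```r
# Vectors standing in for the contents of a.txt, b.txt and c.txt
vals <- list(c(1, 2, 3, 4), c(2, 4, 9), c(2, 3, 4, 8, 10))

# Reduce folds intersect over the list, i.e. it computes
# intersect(intersect(vals[[1]], vals[[2]]), vals[[3]])
commons <- Reduce(intersect, vals)

# accumulate = TRUE returns every intermediate value of the running result
steps <- Reduce(intersect, vals, accumulate = TRUE)

print(commons)
print(steps)
```

The second element of steps is the intersection of the first two vectors, the third is the intersection of all three, which is the same sequence of values that commons takes inside the for loop.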

Data creation:

write(1:4, file = "a.txt", sep = "\n")
write(c(2, 4, 9), file = "b.txt", sep = "\n")
write(c(2, 3, 4, 8, 10), file = "c.txt", sep = "\n")
files <- c("a.txt", "b.txt", "c.txt") 
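
If your real files do carry a header line named x, as the read.table call in the question suggests, the same idea should still work with a small reader function in place of scan. This is only a sketch under that assumption: the file names, the helper readX and the temporary directory are made up for the example.

```r
# Hypothetical files with a header line "x", written to a temporary directory
tmp <- tempdir()
paths <- file.path(tmp, c("a_hdr.txt", "b_hdr.txt", "c_hdr.txt"))
write.table(data.frame(x = 1:4), paths[1], row.names = FALSE, quote = FALSE)
write.table(data.frame(x = c(2, 4, 9)), paths[2], row.names = FALSE, quote = FALSE)
write.table(data.frame(x = c(2, 3, 4, 8, 10)), paths[3], row.names = FALSE, quote = FALSE)

# Illustrative helper: read one file and return its x column as a vector
readX <- function(file) {
  read.table(file, header = TRUE, stringsAsFactors = FALSE)$x
}

# lapply reads every file; Reduce intersects the resulting vectors in turn
commonsHdr <- Reduce(intersect, lapply(paths, readX))
print(commonsHdr)
```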

Upvotes: 3
