R: why does this if condition inside a for loop stick?

Question

corr <- function(directory, threshold = 0) {
  files_full<-list.files(directory, full.names=TRUE)
  v<-vector()
  for (i in 1:10) {
    a <- (read.csv(files_full[i])) 
    b <- subset(a, (!is.na(a[,2])) & (!is.na(a[,3])))
    c <- length(b[ ,4])
    if (c > threshold) {
      d <- cor(b[ ,2],b[ ,3])
    } else {
      d <- vector(mode="numeric", length = 0)
    }
    v <- rbind(v, d)
  }
  v  
}
cr <- corr("specdata", 0)

I have a set of .csv files in a directory and pass the directory name to the function above. For each file, I want to count the complete cases and, provided that count is greater than the threshold set via the second function argument, compute the correlation between the values held in columns 2 and 3 of the file. The ultimate aim is a vector containing the correlation for each file that meets the threshold condition. If the threshold condition isn't met, I want to return a numeric vector of length 0.

The number of complete cases in the first file is 117. The function above works fine so long as the threshold is below this number. If I set the threshold to 117 or higher, the function returns a vector of length 0 and I get the warning:

In rbind(v, d) :
  number of columns of result is not a multiple of vector length (arg 2)

It seems like the condition in the if statement is getting stuck on the value of the number of complete cases in the first file, rather than looping through.

I'd be very grateful if someone could explain where I'm going wrong!

Answers (1)
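
One likely explanation, offered as a sketch rather than a definitive diagnosis: the if condition is probably not stuck at all. When the first file fails the threshold, d is a zero-length vector, and rbind() of two empty vectors gives a matrix with zero columns. Every later rbind(v, d) then has to fit a single correlation into zero columns, so the value is dropped and R emits the warning quoted in the question. The first part of the code below reproduces this without any files. The second part is a minimal rewrite that grows a plain vector with c() instead. The name corr_sketch is hypothetical, and looping over every file in the directory rather than the first 10 is an assumption about the intent.

```r
# Part 1: reproduce the symptom without any files.
v <- vector()                              # logical(0), as in the question
d <- vector(mode = "numeric", length = 0)  # what the else branch assigns
v <- rbind(v, d)                           # all inputs empty: v becomes a matrix with 0 columns
v <- suppressWarnings(rbind(v, 0.5))       # 0.5 is cut down to fit 0 columns (the warning)
ncol(v)                                    # 0, so later correlations cannot be stored

# Part 2: a sketch that appends to a plain numeric vector with c().
# corr_sketch is a hypothetical name; columns 2 and 3 are taken from the question.
corr_sketch <- function(directory, threshold = 0) {
  files_full <- list.files(directory, full.names = TRUE)
  v <- numeric(0)
  for (f in files_full) {                  # assumption: use every file, not just the first 10
    a <- read.csv(f)
    b <- a[!is.na(a[, 2]) & !is.na(a[, 3]), ]  # keep rows complete in columns 2 and 3
    if (nrow(b) > threshold) {
      v <- c(v, cor(b[, 2], b[, 3]))       # append only when the threshold is met
    }
  }
  v                                        # numeric(0) if no file met the threshold
}
```

With this version, a call such as corr_sketch("specdata", 117) should skip the first file and still collect correlations from any later files that have more than 117 complete cases.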
