For loop works for the first 10 files, but then throws an error

Question

I tried to write a function:

pollutantmean<-function(directory,pollutant,id=1:332){
  meanpollut<-matrix(nrow = length(id),ncol = 1)
  for (i in id) {
    data<-read.csv(dir()[i])
    test<-data[,pollutant]
    meanpollut[i,]<-mean(test,na.rm = TRUE)
  }
  m<-mean(meanpollut)
  m
}

I have 332 CSV files in a single directory, "specdata". Running the function on the first 10 files works:

source("pollutantmean.R")
pollutantmean("specdata", "sulfate", 1:10)

But when I try to run this:

pollutantmean("specdata", "nitrate", 70:72)

I get this error:

Error in `[<-`(`*tmp*`, i, , value = mean(test, na.rm = TRUE)) : 
  subscript out of bounds

So I ran the body of the loop by hand for the 70th, 71st, and 72nd files, one at a time, and got an answer each time:

data<-read.csv(dir()[70])
test<-data[,"nitrate"]
mean(test,na.rm = TRUE)

But when I use the for loop with the line meanpollut[i,]<-mean(test,na.rm = TRUE), the error appears again. Could someone give me some advice? Thank you.

akrun · Accepted Answer

We can loop over seq_along(id) instead of over id itself, and use id[i] to pick the file.

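pollutantmean<-function(directory,pollutant,id=1:332){
  # one row per requested file, so the matrix has length(id) rows
  meanpollut<-matrix(nrow = length(id),ncol = 1)
  # i is the position within id (1, 2, 3, ...), not the file number
  for (i in seq_along(id)) {
    # id[i] is the file number; dir() lists the current working directory
    data<-read.csv(dir()[id[i]])
    test<-data[,pollutant]
    meanpollut[i,]<-mean(test,na.rm = TRUE)
  }
  m<-mean(meanpollut)
  m
}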

In the first case, which works, 'id' runs from 1 to 10 and the matrix has 10 rows, so indexing the matrix by 'id' stays in bounds. In the second case the matrix has only 3 rows (the length of 70:72), but indexing with 'id' tries to assign to rows 70 to 72, which don't exist.
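
To make the explanation concrete, here is a minimal sketch that first reproduces the out-of-bounds assignment on a 3-row matrix and then runs a variant of the function on a few small files. The names pollutantmean2, demo_dir, and err_msg are hypothetical and not from the original post, and the demo files are made up for illustration. The variant also assumes you want to read from the directory argument through list.files() rather than from the current working directory; like the answer above, it averages the per-file means.

```r
# 1. Reproduce the problem: a 3-row matrix has no row 70.
id <- 70:72
m <- matrix(nrow = length(id), ncol = 1)
err_msg <- tryCatch({
  m[70, ] <- 1          # same kind of assignment as meanpollut[i, ] with i = 70
  "no error"
}, error = function(e) conditionMessage(e))
print(err_msg)

# 2. A variant that indexes results by position and files by id.
#    Assumption: the files sort alphabetically in the same order as their numbers.
pollutantmean2 <- function(directory, pollutant, id = 1:332) {
  files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  means <- numeric(length(id))            # one slot per requested file
  for (k in seq_along(id)) {              # k = 1, 2, ..., length(id)
    data <- read.csv(files[id[k]])        # id[k] selects the file
    means[k] <- mean(data[[pollutant]], na.rm = TRUE)
  }
  mean(means)
}

# 3. Try it on three tiny made-up files in a temporary directory.
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (n in 1:3) {
  write.csv(data.frame(sulfate = c(n, n + 2, NA)),
            file.path(demo_dir, sprintf("%03d.csv", n)),
            row.names = FALSE)
}
result <- pollutantmean2(demo_dir, "sulfate", 2:3)
print(result)   # files 2 and 3 have means 3 and 4, so this prints 3.5
```

Because the result vector is filled by position k rather than by file number, the same call works for any subset of ids, such as 2:3 here or 70:72 in the question.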
