ling

Reputation: 205

For loop works for the first 10 files but then raises an error

I tried to write a function:

pollutantmean<-function(directory,pollutant,id=1:332){
  meanpollut<-matrix(nrow = length(id),ncol = 1)
  for (i in id) {
    data<-read.csv(dir()[i])
    test<-data[,pollutant]
    meanpollut[i,]<-mean(test,na.rm = TRUE)
  }
  m<-mean(meanpollut)
  m
}

I have 332 csv files in a single directory, "specdata". Running the function on the first 10 files works:

source("pollutantmean.R")
pollutantmean("specdata", "sulfate", 1:10)

But when I run this:

pollutantmean("specdata", "nitrate", 70:72)

I get this error:

Error in `[<-`(`*tmp*`, i, , value = mean(test, na.rm = TRUE)) : 
  subscript out of bounds

So I ran the body of the loop for the 70th, 71st, and 72nd files one at a time, and each one gave an answer:

data<-read.csv(dir()[70])
test<-data[,"nitrate"]
mean(test,na.rm = TRUE)

But when I use the for loop and add meanpollut[i,]<-mean(test,na.rm = TRUE), the error appears again. Could someone give me some advice? Thank you.

Upvotes: 1

Views: 42

Answers (1)

akrun

Reputation: 887158

We can loop over seq_along(id) instead of id, and use id[i] to pick the file.

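pollutantmean<-function(directory,pollutant,id=1:332){
  # one row per requested file, not per file number
  meanpollut<-matrix(nrow = length(id),ncol = 1)
  # i is the position within id (1, 2, 3, ...), so it is always a valid row
  for (i in seq_along(id)) {
    # id[i] is the file number; as in the question, dir() lists the
    # working directory and the 'directory' argument is not used
    data<-read.csv(dir()[id[i]])
    test<-data[,pollutant]
    meanpollut[i,]<-mean(test,na.rm = TRUE)
  }
  # average of the per-file means
  m<-mean(meanpollut)
  m
}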

In the first call, which works, 'id' is 1:10, so the values of 'id' happen to match the row numbers of the 10-row matrix. In the second call, the matrix has only 3 rows (one per element of 70:72), but indexing with the values of 'id' tries to assign to rows 70 to 72, which do not exist. That is what triggers the 'subscript out of bounds' error.
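
The mismatch can be reproduced without any csv files. The following is only a minimal sketch: the names m and res and the placeholder values (id[i] * 10) are made up for illustration and stand in for the matrix and the per-file means.

```r
id <- 70:72
m <- matrix(nrow = length(id), ncol = 1)  # 3 rows, like meanpollut for 70:72

# Using the id value as the row index asks for row 70 of a 3-row matrix.
# tryCatch captures the error message instead of stopping the script.
res <- tryCatch(
  { m[id[1], ] <- 1; "no error" },
  error = function(e) conditionMessage(e)
)
print(res)

# Using the position within id (1, 2, 3) always stays inside the matrix.
for (i in seq_along(id)) {
  m[i, ] <- id[i] * 10  # placeholder value instead of a real mean
}
print(m[, 1])
```

The first assignment fails with the same error as in the question, while the seq_along loop fills all three rows.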

Upvotes: 2
