Reputation: 205
I tried to write a function:
pollutantmean <- function(directory, pollutant, id = 1:332) {
  meanpollut <- matrix(nrow = length(id), ncol = 1)
  for (i in id) {
    data <- read.csv(dir()[i])
    test <- data[, pollutant]
    meanpollut[i, ] <- mean(test, na.rm = TRUE)
  }
  m <- mean(meanpollut)
  m
}
I have 332 CSV files in a single directory, "specdata". The function runs successfully for the first 10 files:
source("pollutantmean.R")
pollutantmean("specdata", "sulfate", 1:10)
But when I try to run this:
pollutantmean("specdata", "nitrate", 70:72)
I get this error:
Error in `[<-`(`*tmp*`, i, , value = mean(test, na.rm = TRUE)) :
subscript out of bounds
So I ran the body of the for loop by hand for the 70th, 71st, and 72nd files, one at a time, and got an answer each time:
data <- read.csv(dir()[70])
test <- data[, "nitrate"]
mean(test, na.rm = TRUE)
But when I use the for loop and add meanpollut[i,]<-mean(test,na.rm = TRUE), the error appears again.
Could someone give me some advice? Thank you.
Upvotes: 1
Views: 42
Reputation: 887158
We can loop over seq_along(id) instead of i in id, and use id[i] to select the file.
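pollutantmean <- function(directory, pollutant, id = 1:332) {
  # One row per requested file, not per file number
  meanpollut <- matrix(nrow = length(id), ncol = 1)
  for (i in seq_along(id)) {
    # id[i] selects the file; i is the row position in meanpollut
    data <- read.csv(dir()[id[i]])
    test <- data[, pollutant]
    meanpollut[i, ] <- mean(test, na.rm = TRUE)
  }
  m <- mean(meanpollut)
  m
}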
In the first case, which works, 'id' runs from 1 to 10 and the matrix has 10 rows, so indexing the matrix with 'id' happens to line up. In the second case, the matrix has only 3 rows (one per element of 70:72), but indexing with 'id' looks for the 70th to 72nd rows, which don't exist.
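To see the row mismatch without reading any files, here is a minimal sketch. It assumes nothing beyond base R; the placeholder values stored in the matrix and the `result` variable are made up for illustration and are not part of the original function.

```r
# A matrix sized for id = 70:72 has exactly 3 rows.
id <- 70:72
meanpollut <- matrix(nrow = length(id), ncol = 1)

# Using the id value itself as the row index asks for row 70,
# which does not exist, so the assignment raises an error.
result <- tryCatch({
  meanpollut[id[1], ] <- 1.5   # placeholder value, not a real mean
  "ok"
}, error = function(e) conditionMessage(e))
print(result)

# Using the position (1, 2, 3) as the row index stays inside the matrix.
for (i in seq_along(id)) {
  meanpollut[i, ] <- id[i]     # placeholder: store the id instead of a file mean
}
print(meanpollut)
```

With id = 1:10 the values and the positions are the same numbers, which is why the original loop appeared to work for the first 10 files.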
Upvotes: 2