Reputation: 135
I'm trying to write a simple function that filters my data frame and calculates the mean of either ozone or PM for the rows where the site ID has a certain value. The data looks like this:
> dput(head(df))
structure(list(ozone = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), pm = c(NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), site.id = c(1, 1, 1, 1, 1, 1)), row.names = c(NA,
6L), class = "data.frame")
My code is the following:
function1<-function(data, air_pollutant, site_id)
{
first_step<-subset(data, site_id)
pollution<-mean(first_step$air_pollutant, na.rm=TRUE)
pollution
}
However, when I try the following:
function1(dat_csv, ozone, 1:115)
I get the following warning:
2: In mean.default(mean$air_pollutant, na.rm = TRUE): argument is not numeric or logical: returning NA
Upvotes: 2
Views: 103
Reputation: 4233
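Valid points in the comments above. Also, pass the air pollutant as a character string when calling the function: first_step$air_pollutant looks for a column literally named air_pollutant, whereas [[ ]] accepts a column name stored in a variable. Note that my example data names the id column site_id, while yours uses site.id. I modified your function to make it work: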
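df <- data.frame(year = c(2010, 2010, 2013),
                 ozone = c(34, 55, 112),
                 pm = c(2, 2, 3),
                 site_id = c(1, 1, 2))
function1 <- function(data, air_pollutant, site_id)
{
  # Keep only the rows whose site_id is among the requested ids
  ss <- data[data$site_id %in% site_id, ]
  # air_pollutant is a column name passed as a string, so index with [[ ]]
  pollution <- mean(ss[[air_pollutant]], na.rm = TRUE)
  pollution
}
function1(df, "ozone", 1:115)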
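To see why the original version returned NA, here is a minimal sketch of the difference between $ and [[ ]]. The toy data frame and the names via_dollar and via_brackets are made up for illustration; they are not from the question.

```r
# Toy data, invented for this illustration only
df <- data.frame(ozone = c(34, 55, 112), site_id = c(1, 1, 2))
air_pollutant <- "ozone"

# $ does not evaluate its right-hand side: it looks for a column
# literally called "air_pollutant", finds none, and returns NULL
via_dollar <- df$air_pollutant

# [[ ]] evaluates the variable and looks up the column it names
via_brackets <- df[[air_pollutant]]

is.null(via_dollar)   # TRUE
mean(via_brackets)    # 67
```

Calling mean() on that NULL is what triggers the "argument is not numeric or logical: returning NA" warning.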
Upvotes: 1