polarsandwich

Reputation: 135

Subset while creating a function

I'm trying to write a simple function that filters my data frame and calculates the mean of either ozone or PM for the rows where the site ID has a certain value. The data looks like this:

> dput(head(df))
structure(list(ozone = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), pm = c(NA_real_, NA_real_, NA_real_, NA_real_, 
NA_real_, NA_real_), site.id = c(1, 1, 1, 1, 1, 1)), row.names = c(NA, 
6L), class = "data.frame")

My code is the following:

 function1<-function(data, air_pollutant, site_id) 
  {
  first_step<-subset(data, site_id)
  pollution<-mean(first_step$air_pollutant, na.rm=TRUE)
  pollution
  }

However, when I try the following:

function1(dat_csv, ozone, 1:115) 

It gives the following warning:

2: In mean.default(mean$air_pollutant, na.rm = TRUE): 
    argument is not numeric or logical: returning NA

Upvotes: 2

Views: 103

Answers (1)

slava-kohut

Reputation: 4233

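Valid points in the comments above. Also, pass the air pollutant as a character string when calling the function. Inside the function, `$` takes the column name literally, so `first_step$air_pollutant` looks for a column actually called `air_pollutant`; `[[air_pollutant]]` uses the value of the argument instead. I modified your function to make it work: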

df <- data.frame(year = c(2010, 2010, 2013),
                 ozone = c(34, 55, 112),
                 pm = c(2, 2, 3),
                 site_id = c(1, 1, 2))

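function1 <- function(data, air_pollutant, site_id)
{
  # keep only the rows whose site_id is among the requested IDs
  ss <- data[data$site_id %in% site_id, ]
  # air_pollutant is a character string, so look the column up with [[ ]]
  pollution <- mean(ss[[air_pollutant]], na.rm = TRUE)
  pollution
}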

function1(df, "ozone", 1:115)
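
To see why the string version works, here is a small sketch on the same toy data. It only illustrates the difference between `$` and `[[`. The function `mean_pollutant` and its `site_col` argument are hypothetical names I made up for this example (they are not in your code); `site_col` is there because your real data seems to call the column `site.id` rather than `site_id`.

```r
# Toy data, same values as in the answer
df <- data.frame(year = c(2010, 2010, 2013),
                 ozone = c(34, 55, 112),
                 pm = c(2, 2, 3),
                 site_id = c(1, 1, 2))

air_pollutant <- "ozone"

# `$` uses the name literally: there is no column called "air_pollutant",
# so this is NULL, and mean(NULL) warns and returns NA
is.null(df$air_pollutant)

# `[[` evaluates the variable and looks up the column it names
df[[air_pollutant]]

# Hypothetical variant: the name of the site column is a parameter too
mean_pollutant <- function(data, air_pollutant, site_id, site_col = "site_id") {
  # fail early if either column name is not in the data
  stopifnot(air_pollutant %in% names(data), site_col %in% names(data))
  rows <- data[[site_col]] %in% site_id
  mean(data[[air_pollutant]][rows], na.rm = TRUE)
}

mean_pollutant(df, "ozone", 1:115)  # all three rows: 67
mean_pollutant(df, "ozone", 1)      # rows with site_id 1 only: 44.5
```

With your data you would presumably call something like `mean_pollutant(dat_csv, "ozone", 1:115, site_col = "site.id")`, assuming `dat_csv` has the columns shown in your `dput` output.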

Upvotes: 1
