MaddieS

Reputation: 71

Calculate mean using rollapply only if certain percent of data is available

I have a column of hourly data and want to use rollapply to calculate the 24-hour rolling average for every hour. My data contains NAs, and I only want to calculate the rolling average if at least 75% of the data in a given 24-hour window is available; otherwise the 24-hour rolling average should be NA.

  df %>%
    mutate(rolling_avg = rollapply(hourly_data, 24, FUN = mean, align = "right", fill = NA))

How can I modify the above code to accomplish this?

Upvotes: 3

Views: 258

Answers (1)

Artem Sokolov

Reputation: 13731

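Define a function that does exactly what you stated: return NA when more than 25% of the window is missing, and otherwise return the mean of the non-missing values: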

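f <- function(v) {
  # More than 25% of the window missing: report NA (numeric, to match mean())
  if (sum(is.na(v)) > length(v) * 0.25) return(NA_real_)
  # Otherwise average the values that are present
  mean(v, na.rm = TRUE)
}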

Then use it in place of mean:

df %>% mutate(rolling_avg = rollapply(hourly_data, 24, FUN = f, 
                                     align = "right", fill = NA ))
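To see the threshold in action, here is a minimal sketch on a small made-up vector, using a window of 4 instead of 24 so the result can be checked by hand. The helper name mean_if_enough, its min_frac argument, and the sample data are assumptions for illustration, not part of the original question; the logic is the same 75% rule expressed as a fraction of available values.

```r
library(zoo)

# Hypothetical helper (name and min_frac are illustrative):
# returns NA when fewer than min_frac of the window's values are present,
# otherwise the mean of the values that are present.
mean_if_enough <- function(v, min_frac = 0.75) {
  if (mean(!is.na(v)) < min_frac) return(NA_real_)
  mean(v, na.rm = TRUE)
}

# Made-up data: 8 "hours" with some gaps
x <- c(1, 2, NA, 4, 5, 6, NA, NA)

# Width 4, right-aligned: the first 3 positions are filled with NA
result <- rollapply(x, width = 4, FUN = mean_if_enough,
                    align = "right", fill = NA)
print(result)
```

Positions 4 through 7 each have exactly one missing value in their window (75% available), so they get a mean; the last window has two missing values (50% available) and comes out as NA.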

Upvotes: 3
