Reputation: 1
I have a table containing the days of a single year and the average temperature (mean_temp) for each day. My objective is to compute the average temperature over the 30 days following each date. However, I am having trouble with the last 30 observations, where the window runs past the end of the year.
So the data looks similar to this:
library(dplyr)  # tibble(), mutate(), if_else()
library(zoo)    # rollapply()

temp_data = tibble(
  day_year = c(1:366),
  temp = rnorm(366, 5, 2)
)
My initial attempt (below) either fails with "error in temp_data$temp[1:(day_year - 335)]: ! only 0's may be mixed with negative subscripts" or, if I take the absolute value of (day_year - 335), computes a single value for all of the last observations.
temp_data |>
  mutate(
    temp_next_30d = if_else(
      day_year < 336,
      rollapply(temp, width = 30, FUN = mean, align = "right", fill = NA, partial = TRUE),
      mean(temp_data$temp[day_year:366]) + mean(temp_data$temp[1:(day_year-335)])
    )
  )
Are there any elegant solutions to address this?
Upvotes: 0
Views: 30
Reputation: 19169
One option is to append the first 29 days to the end of the data, calculate the rolling mean, and then remove the appended rows. I also think you want align = "left", so that each window starts at the given date and looks forward.
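temp_data2 <- temp_data |>
  # append days 1-29 so windows near the end of the year can wrap around
  bind_rows(slice(temp_data, 1:29)) |>
  mutate(
    temp_next_30d = zoo::rollapply(temp, width = 30, FUN = mean,
                                   align = "left", fill = NA, partial = TRUE)) |>
  # drop the appended rows again
  slice(1:366)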
temp_data2
# A tibble: 366 × 3
   day_year  temp temp_next_30d
      <int> <dbl>         <dbl>
 1        1  4.30          4.51
 2        2  5.31          4.47
 3        3  6.04          4.37
 4        4  2.36          4.31
 5        5  3.83          4.44
 6        6  6.04          4.53
 7        7  4.48          4.50
 8        8  3.38          4.55
 9        9  5.04          4.71
10       10  4.30          4.72
tail(temp_data2)
# A tibble: 6 × 3
  day_year  temp temp_next_30d
     <int> <dbl>         <dbl>
1      361  4.91          4.61
2      362  4.92          4.58
3      363  4.44          4.58
4      364  4.26          4.58
5      365  5.78          4.63
6      366  8.17          4.64
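If you would rather not pad the data, the same wrap-around window can be sketched in base R with modular indexing. This is only a minimal sketch under a few assumptions: the names n, window, temp and next_mean are made up for illustration, the data has exactly one row per day in order, and the window is defined the same way as above (the given day plus the 29 days after it).

```r
# Illustrative stand-in for temp_data$temp (hypothetical names throughout)
set.seed(1)
n      <- 366   # days in the year
window <- 30    # window length, including the current day
temp   <- rnorm(n, 5, 2)

# For each day i, build the indices i, i + 1, ..., i + 29 and wrap any
# index past day 366 back to the start of the year with %%.
next_mean <- vapply(seq_len(n), function(i) {
  idx <- ((i - 1 + 0:(window - 1)) %% n) + 1
  mean(temp[idx])
}, numeric(1))

head(next_mean)
tail(next_mean)
```

If the window should exclude the given day itself, replacing 0:(window - 1) with 1:window should do it. Working on one scalar i at a time also sidesteps the original error: as far as I can tell, 1:(day_year - 335) inside mutate() only uses the first element of day_year, so it becomes 1:-334, which mixes positive and negative subscripts.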
Upvotes: 1