Reputation: 726
I am trying to calculate the moving average for each name
in the dataset below at 1, 2 , and 3 days. Here is the example of what my dataset looks like (the dput example was a little long for this question):
Date_Time_GMT_3 name value
<dttm> <chr> <dbl>
1 2021-07-08 07:15:00 20817729_2PT_Stationary 21.5
2 2021-07-08 07:15:00 20822229_4PT_Stationary 17.9
3 2021-07-08 07:15:00 20819744_6PT_AIR 17.9
4 2021-07-08 08:30:00 20817729_2PT_Stationary 18.6
5 2021-07-08 08:30:00 20822229_4PT_Stationary 16.5
6 2021-07-08 08:30:00 20819744_6PT_AIR 19.0
7 2021-07-09 10:45:00 20817729_2PT_Stationary 21.5
8 2021-07-09 10:45:00 20822229_4PT_Stationary 17.9
9 2021-07-09 10:45:00 20819744_6PT_AIR 17.9
10 2021-07-11 09:15:00 20817729_2PT_Stationary 18.6
11 2021-07-11 09:15:00 20822229_4PT_Stationary 16.5
12 2021-07-11 09:15:00 20819744_6PT_AIR 19.0
13 2021-07-12 15:45:00 20817729_2PT_Stationary 21.5
14 2021-07-12 15:45:00 20822229_4PT_Stationary 17.9
15 2021-07-12 15:45:00 20819744_6PT_AIR 17.9
16 2021-07-13 06:15:00 20817729_2PT_Stationary 18.6
17 2021-07-13 06:15:00 20822229_4PT_Stationary 16.5
18 2021-07-13 06:15:00 20819744_6PT_AIR 19.0
I have tried using this code
some_data = df %>%
dplyr::group_by(name) %>%
dplyr::mutate(mean_01da = zoo::rollmean(value, k = 1, fill = NA),
mean_02da = zoo::rollmean(value, k = 2, fill = NA),
mean_03da = zoo::rollmean(value, k = 3, fill = NA)) %>%
dplyr::ungroup()
but the moving average isn't over the time specified - I'm thinking because k
should be the number of rows I want to use for values. Is there a way to calculate the moving average by specifying the number of days to calculate over (i.e. 1, 2, and 3 days) using the date/time column?
Upvotes: 0
Views: 639
Reputation: 270055
Use an expression involving findInterval to find how many rows back to use and note that in rollapplyr we can specify a vector for the number of rows to use. POSIXct objects use seconds internally so 24 * 3600 is the number of seconds in 24 hours, etc.
library(dplyr, exclude = c("lag", "filter"))
library(zoo)
DF %>%
group_by(name) %>%
mutate(tt = Date_Time_GMT_3,
day1 = rollapplyr(value, 1:n() - findInterval(tt - 24 * 3600, tt), mean),
day2 = rollapplyr(value, 1:n() - findInterval(tt - 48 * 3600, tt), mean),
day3 = rollapplyr(value, 1:n() - findInterval(tt - 72 * 3600, tt), mean)) %>%
ungroup %>%
select(-tt)
The input in reproducible form:
DF <-
structure(list(Date_Time_GMT_3 = structure(c(1625742900, 1625742900,
1625742900, 1625747400, 1625747400, 1625747400, 1625841900, 1625841900,
1625841900, 1626009300, 1626009300, 1626009300, 1626119100, 1626119100,
1626119100, 1626171300, 1626171300, 1626171300), class = c("POSIXct",
"POSIXt"), tzone = ""), name = c("20817729_2PT_Stationary", "20822229_4PT_Stationary",
"20819744_6PT_AIR", "20817729_2PT_Stationary", "20822229_4PT_Stationary",
"20819744_6PT_AIR", "20817729_2PT_Stationary", "20822229_4PT_Stationary",
"20819744_6PT_AIR", "20817729_2PT_Stationary", "20822229_4PT_Stationary",
"20819744_6PT_AIR", "20817729_2PT_Stationary", "20822229_4PT_Stationary",
"20819744_6PT_AIR", "20817729_2PT_Stationary", "20822229_4PT_Stationary",
"20819744_6PT_AIR"), value = c(21.5, 17.9, 17.9, 18.6, 16.5,
19, 21.5, 17.9, 17.9, 18.6, 16.5, 19, 21.5, 17.9, 17.9, 18.6,
16.5, 19)), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17", "18"))
Upvotes: 1