cs0815

Reputation: 17388

rollmean from zoo package returns unexpected results

I am using this code:

library(dplyr)
library(lubridate)
library(zoo)

temp <- data.frame(
        date = as.Date(c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-06-01", "2015-07-01", "2015-08-01", "2015-09-01", "2015-10-01", "2015-11-01", "2015-12-01"))
        , value = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
    ) %>%
    arrange(
        date
    ) %>%
    mutate(
        value_rollmean = rollmean(value, k = 2, fill = NA)
    ) 

temp

It bizarrely returns:

         date value value_rollmean
1  2015-12-01    12             NA
2  2015-11-01    11           11.5
3  2015-10-01    10           10.5
4  2015-09-01     9            9.5
5  2015-08-01     8            8.5
6  2015-07-01     7            7.5
7  2015-06-01     6            6.5
8  2015-05-01     5            5.5
9  2015-04-01     4            4.5
10 2015-03-01     3            3.5
11 2015-02-01     2            2.5
12 2015-01-01     1            1.5 

Why does the NA land on the last date (1 December 2015) rather than on the first date (1 January 2015)?

Expected output:

         date value value_rollmean
1  2015-01-01     1             NA
2  2015-02-01     2             NA
3  2015-03-01     3            1.5
4  2015-04-01     4            2.5
5  2015-05-01     5            3.5
6  2015-06-01     6            4.5
7  2015-07-01     7            5.5
8  2015-08-01     8            6.5
9  2015-09-01     9            7.5
10 2015-10-01    10            8.5
11 2015-11-01    11            9.5
12 2015-12-01    12           10.5

Upvotes: 0

Views: 601

Answers (2)

lroha

Reputation: 34406

I'm happy to be corrected, but for this case I think you need rollapply() so that you can use its width argument, which doesn't seem to be available in the convenience functions such as rollmean(). A width passed as a list is treated as a set of offsets relative to the current row, so list(-(2:1)) averages the two rows before the current one:

library(zoo)
library(dplyr)

dat %>%
  mutate(value_rollmean = rollapply(value, width = list(-(2:1)), mean, fill = NA)) 

         date value value_rollmean
1  2015-01-01     1             NA
2  2015-02-01     2             NA
3  2015-03-01     3            1.5
4  2015-04-01     4            2.5
5  2015-05-01     5            3.5
6  2015-06-01     6            4.5
7  2015-07-01     7            5.5
8  2015-08-01     8            6.5
9  2015-09-01     9            7.5
10 2015-10-01    10            8.5
11 2015-11-01    11            9.5
12 2015-12-01    12           10.5

Data:

dat <- data.frame(
  date = as.Date(c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-06-01", "2015-07-01", "2015-08-01", "2015-09-01", "2015-10-01", "2015-11-01", "2015-12-01"))
  , value = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
)
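
As a cross-check of the offset approach, here is a minimal sketch that compares it with a right-aligned rolling mean shifted down by one row. It assumes the same 12 monthly values as above; the column names by_offsets and by_lag are made up for illustration and are not part of zoo or dplyr.

```r
library(zoo)
library(dplyr)

# Same data as above, built with seq() for brevity
dat <- data.frame(
  date = seq(as.Date("2015-01-01"), by = "month", length.out = 12),
  value = as.numeric(1:12)
)

res <- dat %>%
  mutate(
    # Offsets -2 and -1: mean of the two rows before the current one
    by_offsets = rollapply(value, width = list(-(2:1)), FUN = mean, fill = NA),
    # Right-aligned mean of the current and previous row, then lagged one row
    by_lag = dplyr::lag(rollmeanr(value, k = 2, fill = NA))
  )

# Compare the two columns element by element
all.equal(res$by_offsets, res$by_lag)
```

If the two columns agree, either form can be used; the list-of-offsets version states the window explicitly, while the lagged version may read more naturally to people used to rollmeanr().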

Upvotes: 2

cs0815

Reputation: 17388

It appears that I have to sort by date in descending order before computing the rolling mean, then sort back into ascending order (why would anyone want a moving average in a different direction?!):

library(dplyr)
library(lubridate)
library(zoo)

temp <- data.frame(
        date = as.Date(c("2015-01-01", "2015-02-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-06-01", "2015-07-01", "2015-08-01", "2015-09-01", "2015-10-01", "2015-11-01", "2015-12-01"))
        , value = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
    ) %>%
    arrange(
        desc(date)
    ) %>%
    mutate(
        value_rollmean = rollmean(value, k = 2, fill = NA)
    ) %>%
    arrange(
        date
    )

temp
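
A possible alternative to sorting twice, offered as a sketch rather than a definitive fix: rollmean() takes an align argument, and align = "right" places each mean on the last row of its window, so the NA falls on the first date when the data is in ascending order. Note that this averages the current and the previous row (February gets 1.5), which is not the same as the offset-based answer that excludes the current row.

```r
library(zoo)
library(dplyr)

temp <- data.frame(
  date = seq(as.Date("2015-01-01"), by = "month", length.out = 12),
  value = as.numeric(1:12)
) %>%
  arrange(date) %>%
  mutate(
    # Right-aligned window: each row holds the mean of itself and the row before
    value_rollmean = rollmean(value, k = 2, fill = NA, align = "right")
  )

temp
```

zoo also provides rollmeanr() as a shorthand for the right-aligned variant.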

Upvotes: 1
