Matt
Matt

Reputation: 213

Moving Variance with Aggregation

I have some 6 minute frequency thermocouple data. Thermocouples are installed at various heights and at each height there are a number thermocouple distinguished by radial position

DT_TI_RECORDED      HEIGHT      POS             TEMPERATURE
2018-05-16 00:00:00     1       90              111
2018-05-16 00:00:00     1       180             112
2018-05-16 00:00:00     1       270             113
2018-05-16 00:00:00     2       90              112
2018-05-16 00:00:00     2       180             114
2018-05-16 00:00:00     2       270             115
2018-05-16 00:00:00     3       90              112
2018-05-16 00:00:00     3       180             112
2018-05-16 00:00:00     3       270             113
...
2018-05-16 00:06:00     1       90              111
2018-05-16 00:06:00     1       180             112
2018-05-16 00:06:00     1       270             113
2018-05-16 00:06:00     2       90              112
2018-05-16 00:06:00     2       180             114
2018-05-16 00:06:00     2       270             112
2018-05-16 00:06:00     3       90              114
2018-05-16 00:06:00     3       180             112
2018-05-16 00:06:00     3       270             114
...

Every 6 minutes for each unique combination of height and position I would like to calculate a backwards n hourly moving variance Lets say 4 hourly.

The original code I am trying to duplicate is written for SAS stats package

    PROC EXPAND DATA=Raw_data
        OUT=Moving_Variance
        ALIGN = BEGINNING
    ;
    by HEIGHT POS;
    ID DT_TI_RECORDED ;
        CONVERT TEMPERATURE = Moving_4hour_Var /  METHOD = none TRANSFORMOUT = (MOVVAR 40); 
    #/* 40 obs at 6min freq = 4hour moving variance*/
    QUIT;

I have spent a couple of hours searching google I think the R library I need to use is called zoo and function I want is rollapply but I can't figure out how to combine aggregations with rollapply.

I have tried

moving_var <- Raw_data %>%
              aggregate(HEIGHT,POS) %>%
              rollapply( TEMPERATURE, width = 40, FUN = sd, fill = NA)

But does not work. I am very new to R programming and this is driving me crazy.

Upvotes: 1

Views: 141

Answers (1)

R. Schifini
R. Schifini

Reputation: 9313

Try the following aggregation:

library(zoo)

result = aggregate(temp ~ pos + height,
              data = df,
              FUN = function(x){
                  rollapply(x, width = 40, FUN = var, by = 40)
              }
)

width is the width of the rolling window, while by is how many points are skipped for the next window starting point. With 40 in each you'll get the start of each window next to the end of the previous one.

The resulting data frame has one column for each window. This structure can be considered as "wide". If you want to have it as a "long" format, use gather from tidyr or melt from reshape2.

Example:

df = structure(list(pos = c(0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 
                            180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 
                            0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 
                            270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 
                            180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 
                            0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 
                            270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 
                            180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 
                            0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 
                            270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 
                            180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 
                            0, 90, 180, 270), height = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
                            3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), temp = 1:160), .Names = c("pos", 
                            "height", "temp"), row.names = c(NA, -160L), class = "data.frame")

> head(df,20)
   pos height temp
1    0      1    1
2   90      1    2
3  180      1    3
4  270      1    4
5    0      2    5
6   90      2    6
7  180      2    7
8  270      2    8
9    0      3    9
10  90      3   10
11 180      3   11
12 270      3   12
13   0      4   13
14  90      4   14
15 180      4   15
16 270      4   16
17   0      1   17
18  90      1   18
19 180      1   19
20 270      1   20


library(zoo)

result = aggregate(temp ~ pos + height,
              data = df,
              FUN = function(x){
                  rollapply(x, width = 3, FUN = var, by = 3)
              }
)

will result in:

   pos height temp.1 temp.2 temp.3
1    0      1    256    256    256
2   90      1    256    256    256
3  180      1    256    256    256
4  270      1    256    256    256
5    0      2    256    256    256
6   90      2    256    256    256
7  180      2    256    256    256
8  270      2    256    256    256
9    0      3    256    256    256
10  90      3    256    256    256
11 180      3    256    256    256
12 270      3    256    256    256
13   0      4    256    256    256
14  90      4    256    256    256
15 180      4    256    256    256
16 270      4    256    256    256

Upvotes: 1

Related Questions