pineapple
pineapple

Reputation: 169

How can I calculate a self- defined moving average (and other formula) of my data?

I have this dataframe:

> df

idx count  
1    2
2    3
3    6
4    1
5    8
6    3
7    9
8    10
9    20
10   3
11   4
12   7
13   1
14   9
15   6
16   2
17   0
18   3
19   4
20   6

What I want to do:

Using the df to explain this:

> df_new

idx   count    mvave     incline     sd    incline-mean       sd-mean
1      2        2          2          2         -3.35          -4.47
2      3        3.67       1.67       0.88      -3.68
3      6        3.33       -0.33      0.88       ...
4      1        5          1.67       0.84
5      8        4          -1         1.35
6      3        6.67       2.67       1.76
7      9        7.33       0.67       3.48
8      10       13         5.67       2.87
9      20       11         -2         2
10     3        9          -2         3.24
11     4        4.67       -4.33      2.71
12     7        4          -0.67      0.84
13     1        5.67       1.67       0.88
14     9        5.33       -0.33      0.19
15     6        5.67       0.33       1.64
16     2        2.67       -3         2.08
17     0        1.67       -1         0.51
18     3        2.33       0.67       1.39
19     4        4.33       2          1.02
20     6        6          1.67       0.67

Example calculations:

mvave:      1) (2+3+6)/3 = 3.67
            2) (3+6+1)/3 = 3.33
            ....
incline:    1) (3.67-2)/1 = 1.67
            2) (3.33-3.67) = -0.33
            ....
stdev:      1) sd(2+3.67+3.33) = 0.88
            2) sd(3.67+3.33+5) = 0.88
                ...
incline-mean:   1.67 - mean(df$count) = -3.35 
                ...
sd-mean:        0.88 - mean(df$count) = -4.47
                ...

I know how to calculate all of these values, however doing iterations and reset the x parameters is where I fail.

I know there is a rollmean function, however I would like to do this and the other calculations step-by-step.

Upvotes: 2

Views: 59

Answers (1)

akrun
akrun

Reputation: 887108

One option would be rollapply/rollmean

library(zoo)
library(dplyr)
df %>%
   mutate(mvave = rollmean(count, 3, fill = TRUE), 
          mvave = ifelse(row_number() %in% c(1, n()), count, mvave),
          incline = c(first(mvave), diff(mvave)),
          sd = rollapply(mvave, 3, FUN = sd, fill = TRUE), 
          incline_mean = incline - mean(count), 
          sd_mean = sd - mean(count))
#  idx count     mvave    incline        sd incline_mean   sd_mean
#1    1     2  2.000000  2.0000000 1.0000000   -3.3500000 -4.350000
#2    2     3  3.666667  1.6666667 0.8819171   -3.6833333 -4.468083
##3    3     6  3.333333 -0.3333333 0.8819171   -5.6833333 -4.468083
#4    4     1  5.000000  1.6666667 0.8388705   -3.6833333 -4.511130
#5    5     8  4.000000 -1.0000000 1.3471506   -6.3500000 -4.002849
#6    6     3  6.666667  2.6666667 1.7638342   -2.6833333 -3.586166
#7    7     9  7.333333  0.6666667 3.4801022   -4.6833333 -1.869898
#8    8    10 13.000000  5.6666667 2.8738927    0.3166667 -2.476107
#9    9    20 11.000000 -2.0000000 2.0000000   -7.3500000 -3.350000
#10  10     3  9.000000 -2.0000000 3.2375116   -7.3500000 -2.112488
#11  11     4  4.666667 -4.3333333 2.7148426   -9.6833333 -2.635157
#12  12     7  4.000000 -0.6666667 0.8388705   -6.0166667 -4.511130
#13  13     1  5.666667  1.6666667 0.8819171   -3.6833333 -4.468083
#14  14     9  5.333333 -0.3333333 0.1924501   -5.6833333 -5.157550
#15  15     6  5.666667  0.3333333 1.6442943   -5.0166667 -3.705706
#16  16     2  2.666667 -3.0000000 2.0816660   -8.3500000 -3.268334
#17  17     0  1.666667 -1.0000000 0.5091751   -6.3500000 -4.840825
#18  18     3  2.333333  0.6666667 1.3877773   -4.6833333 -3.962223
#19  19     4  4.333333  2.0000000 1.8358568   -3.3500000 -3.514143
#20  20     6  6.000000  1.6666667 1.0000000   -3.6833333 -4.350000

data

df <- structure(list(idx = 1:20, count = c(2L, 3L, 6L, 1L, 8L, 3L, 
 9L, 10L, 20L, 3L, 4L, 7L, 1L, 9L, 6L, 2L, 0L, 3L, 4L, 6L)),
 class = "data.frame", row.names = c(NA, 
  -20L))

Upvotes: 1

Related Questions