emanuele
emanuele

Reputation: 2589

Moving variance in R

I know that the filter() function in R calculate the moving average. I would like to know if exists a function that return me the moving variance or standard deviation, in order to show it in a plot side by side with the output of filter() function.

Upvotes: 32

Views: 19104

Answers (4)

GoGonzo
GoGonzo

Reputation: 2877

runner function in runner package applies any R function on running windows. With runner one can specify simple window by setting the length k or lag. Below moving sd as suggested by OOP on 4-elements windows.

enter image description here

library(runner)

set.seed(1)
x <- rnorm(20, sd = 1)
runner(x, sd, k = 4, na_pad = TRUE)

#[1]        NA        NA        NA 1.1021597 0.9967429 1.1556947 0.9884053 0.6902835 0.7180483 0.4647160
#[11] 0.7454670 0.7489618 0.9449882 1.5821988 1.4459037 1.3889432 1.3954101 0.6193867 0.5296744 0.4266423

To apply running functions on date-windows one should specify idx. Length of idx should be the same length as x and should be of type Date or integer. Example below illustrates window of size k = 4 lagged by lag = 1. In parentheses index ranges for each window.

enter image description here

idx <- c(4, 6, 7, 13, 17, 18, 18, 21, 27, 31, 37, 42, 44, 47, 48)
runner::runner(x = 1:15, 
               k = 5,
               lag = 1,
               idx = idx,
               f = function(x) mean(x))

# [1]   NA  1.0  1.5   NA  4.0  4.5  4.5  6.0   NA  9.0   NA 11.0 12.0 12.5 13.5

More info in documentation and vignettes

Upvotes: 0

Gavin Simpson
Gavin Simpson

Reputation: 174898

Consider the zoo package. For example filter() gives:

> filter(1:100, rep(1/3,3))
Time Series:
Start = 1 
End = 100 
Frequency = 1 
  [1] NA  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
 [76] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 NA

whereas rollmean() in zoo gives:

> rollmean(1:100, k = 3, na.pad = TRUE)
  [1] NA  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 [26] 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
 [51] 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
 [76] 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 NA

which is the same (for a 3 point moving average in this example).

Whilst zoo doesn't have a rollsd() or rollvar() it does have rollapply(), which works like the apply() functions to apply any R function to the specified window.

> rollapply(1:100, width = 3, FUN = sd, na.pad = TRUE)
  [1] NA  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 [26]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 [51]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
 [76]  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 NA
Warning message:
In rollapply.zoo(zoo(data), ...) : na.pad argument is deprecated

or on something more interesting:

> rollapply(vec, width = 3, FUN = sd, na.pad = TRUE)
  [1]        NA 0.3655067 0.8472871 0.5660495 0.3491970 0.4732417 0.9236859
  [8] 0.8075226 1.8725851 1.1930784 0.6329325 1.1412416 0.8430772 0.5808005
 [15] 0.3838545 1.1738170 1.1655400 1.3241700 0.6876834 0.1534157 0.4858477
 [22] 0.9843506 0.6002713 0.6897541 2.0619563 2.5675788 6.3522039 6.0066864
 [29] 6.2618432 5.1704866 2.1360853 2.5602557 1.0408528 1.0316396 4.9441628
 [36] 5.0319314 5.7589716 3.2425000 4.8788158 2.0847286 4.5199291 2.5323486
 [43] 2.1987149 1.8393000 1.2278639 1.5998965 1.5341485 4.4287108 4.4159166
 [50] 4.3224546 3.6959067 4.9826264 5.3134044 8.4084322 9.1249234 7.5506725
 [57] 3.8499136 3.9680487 5.6362296 4.9124095 4.3452706 4.0227141 4.5867559
 [64] 4.7350394 4.3203807 4.4506799 7.2759499 7.6536424 7.8487654 2.0905576
 [71] 4.0056880 5.6209853 1.5551659 1.3615268 2.8469458 2.8323588 1.9848578
 [78] 1.1201124 1.4248380 1.7802571 1.4281773 2.5481935 1.8554451 1.0925410
 [85] 2.1823722 2.2788755 2.4205378 2.0733741 0.7462248 1.3873578 1.4265948
 [92] 0.7212619 0.7425993 1.0696432 2.4520585 3.0555819 3.1000885 1.0945292
 [99] 0.3726928        NA
Warning message:
In rollapply.zoo(zoo(data), ...) : na.pad argument is deprecated

You can get rid of the warning by using the fill = NA argument, as in

> rollapply(vec, width = 3, FUN = sd, fill = NA)

Upvotes: 41

Joshua Ulrich
Joshua Ulrich

Reputation: 176698

The TTR package has runSD among others:

> library(TTR)
> ls("package:TTR", pattern="run*")
 [1] "runCor"    "runCov"    "runMAD"    "runMax"    "runMean"  
 [6] "runMedian" "runMin"    "runSD"     "runSum"    "runVar"

runSD will be much faster than rollapply because it avoids making many R function calls. For example:

rzoo <- function(x,n) rollapplyr(x, n, sd, fill=NA)
rttr <- function(x,n) runSD(x, n)
library(rbenchmark)
set.seed(21)
x <- rnorm(1e4)
all.equal(rzoo(x,250), rttr(x,250))
# [1] TRUE
benchmark(rzoo(x,250), rttr(x,250))[,1:6]
#           test replications elapsed relative user.self sys.self
# 2 rttr(x, 250)          100    0.58    1.000      0.58     0.00
# 1 rzoo(x, 250)          100   54.53   94.017     53.85     0.06

Upvotes: 24

Matthew Plourde
Matthew Plourde

Reputation: 44614

rollapply in the zoo package takes an arbitrary function. It's different from filter in that it excludes the NAs by default.

That being said, though, there's not much sense in loading a package for a function that's so simple to roll yourself (pun intended).

Here's one that's right aligned:

my.rollapply <- function(vec, width, FUN) 
    sapply(seq_along(vec), 
           function(i) if (i < width) NA else FUN(vec[i:(i-width+1)]))

set.seed(1)
vec <- sample(1:50, 50)
my.rollapply(vec, 3, sd)
 [1]        NA        NA  7.094599 12.124356 16.522712 18.502252 18.193405  7.234178  8.144528
[10] 14.468356 12.489996  3.055050 20.808652 19.467922 18.009257 18.248288 15.695010  7.505553
[19] 10.066446 11.846237 17.156146  6.557439  5.291503 23.629078 22.590558 21.197484 22.810816
[28] 24.433583 19.502137 16.165808 11.503623 12.288206  9.539392 13.051181 13.527749 19.974984
[37] 19.756855 17.616280 19.347696 18.248288 15.176737  6.082763 10.000000 10.016653  4.509250
[46]  2.645751  1.527525  5.291503 10.598742  6.557439

# rollapply output for comparison
rollapply(vec, width=3, sd, fill=NA, align='right')
 [1]        NA        NA  7.094599 12.124356 16.522712 18.502252 18.193405  7.234178  8.144528
[10] 14.468356 12.489996  3.055050 20.808652 19.467922 18.009257 18.248288 15.695010  7.505553
[19] 10.066446 11.846237 17.156146  6.557439  5.291503 23.629078 22.590558 21.197484 22.810816
[28] 24.433583 19.502137 16.165808 11.503623 12.288206  9.539392 13.051181 13.527749 19.974984
[37] 19.756855 17.616280 19.347696 18.248288 15.176737  6.082763 10.000000 10.016653  4.509250
[46]  2.645751  1.527525  5.291503 10.598742  6.557439

Upvotes: 19

Related Questions