dleal

Reputation: 2314

Apply a rolling function by group in R (zoo, data.table)

I am having trouble doing something fairly simple: applying a rolling function (standard deviation) by group in a data.table. When I call rollapply inside a data.table with by on some column, each group gets back fewer values than it has rows, so data.table recycles them to fill the column, as noted in the warning messages below. Instead of recycled standard deviations, I would like NAs for the observations that fall outside a complete window.

This is my approach so far using iris, and a rolling window of size 2, aligned to the right:

library(zoo)
library(data.table)

A <- iris
setDT(A)
A[, stdev := rollapply(Petal.Width, width = 2, sd, align = 'right', partial = FALSE), by = Species]
Warning messages:
1: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 1 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
2: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 2 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).
3: In `[.data.table`(A, , `:=`(stdeev, rollapply(Petal.Width, width = 2,  :
  Supplied 49 items to be assigned to group 3 of size 50 in column 'stdeev' (recycled leaving remainder of 1 items).

> A
     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species     stdeev      stdev
  1:          5.1         3.5          1.4         0.2    setosa 0.00000000 0.00000000
  2:          4.9         3.0          1.4         0.2    setosa 0.00000000 0.00000000
  3:          4.7         3.2          1.3         0.2    setosa 0.00000000 0.00000000
  4:          4.6         3.1          1.5         0.2    setosa 0.00000000 0.00000000
  5:          5.0         3.6          1.4         0.2    setosa 0.14142136 0.14142136
 ---                                                                                  
146:          6.7         3.0          5.2         2.3 virginica 0.28284271 0.28284271
147:          6.3         2.5          5.0         1.9 virginica 0.07071068 0.07071068
148:          6.5         3.0          5.2         2.0 virginica 0.21213203 0.21213203
149:          6.2         3.4          5.4         2.3 virginica 0.35355339 0.35355339
150:          5.9         3.0          5.1         1.8 virginica 0.42426407 0.42426407
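
The length mismatch behind the warnings can be reproduced on a single group outside data.table. This is only a minimal sketch, assuming zoo is installed; the names x and rolled are illustrative and not part of the original code.

```r
library(zoo)

# One group of 50 values, taken from the built-in iris data
x <- iris$Petal.Width[iris$Species == "setosa"]

# Without fill, rollapply returns one value per complete window,
# i.e. length(x) - width + 1 values
rolled <- rollapply(x, width = 2, FUN = sd, align = "right")

length(x)       # 50
length(rolled)  # 49
```

Assigning those 49 values to a 50-row group is what triggers the "Supplied 49 items ... group 1 of size 50" warning.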

Upvotes: 0

Views: 3762

Answers (1)

eipi10

Reputation: 93821

Add fill=NA to rollapply. This will ensure that a vector of length 50 (rather than 49) is returned, with NA as the first value (since align="right"), avoiding recycling.

A[, stdev := rollapply(Petal.Width, width = 2, sd, align = 'right', partial = FALSE, fill = NA), by = Species]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species      stdev
1            5.1         3.5          1.4         0.2     setosa         NA
2            4.9         3.0          1.4         0.2     setosa 0.00000000
3            4.7         3.2          1.3         0.2     setosa 0.00000000
...
51           7.0         3.2          4.7         1.4 versicolor         NA
52           6.4         3.2          4.5         1.5 versicolor 0.07071068
53           6.9         3.1          4.9         1.5 versicolor 0.00000000
...
101          6.3         3.3          6.0         2.5  virginica         NA
102          5.8         2.7          5.1         1.9  virginica 0.42426407
103          7.1         3.0          5.9         2.1  virginica 0.14142136
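
One way to check the result is sketched below: after adding fill = NA, each group should have exactly one NA, in its first row. This assumes zoo and data.table are installed; B and check are illustrative names, and the code works on a copy so the built-in iris is left untouched.

```r
library(zoo)
library(data.table)

# Work on a copy of iris rather than modifying A from the question
B <- as.data.table(iris)

# fill = NA pads the result to the group length; with align = "right"
# the padding lands at the start of each group
B[, stdev := rollapply(Petal.Width, width = 2, FUN = sd,
                       align = "right", fill = NA),
  by = Species]

# Count NAs per group and confirm the NA is in the first row
check <- B[, .(n_na = sum(is.na(stdev)),
               first_is_na = is.na(stdev[1])),
           by = Species]
print(check)
```

zoo also provides rollapplyr, a wrapper that sets align = "right", so rollapplyr(Petal.Width, 2, sd, fill = NA) should be an equivalent, shorter spelling.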

Upvotes: 3
