Reputation: 306
I want to calculate a rolling sum and rolling average of my data with the size of the rolling window defined for each row.
For example, suppose I have daily temperature and daily precipitation for different cities. I want to calculate past average temperatures and past cumulative rain for each city, but the window of analysis changes in each row. I also need to calculate past climatic variables while skipping the most recent few observations, and the number to skip is also set per row.
The code below helps to give an example of my needs.
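set.seed(122)
# Toy data: two cities with five ordered days each; skip and windw_sz vary by row
df <- data.frame(rain = rep(5, 10), temp = 1:10, skip = sample(0:2, 10, TRUE),
                 windw_sz = sample(1:2, 10, TRUE), city = c(rep("a", 5), rep("b", 5)), ord = rep(1:5, 2))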
df
rain temp skip windw_sz city ord
1 5 1 0 2 a 1
2 5 2 1 1 a 2
3 5 3 2 2 a 3
4 5 4 2 1 a 4
5 5 5 2 2 a 5
6 5 6 0 1 b 1
7 5 7 2 2 b 2
8 5 8 1 2 b 3
9 5 9 2 1 b 4
10 5 10 2 2 b 5
In the first row, skip == 0 and windw_sz == 2, so I should consider the variables from today and yesterday. In the second row, skip == 1 and windw_sz == 1, so I need to consider the variables from yesterday only. In the third row, skip == 2 and windw_sz == 2, so I should skip the variables from today and yesterday and consider only the two days before yesterday.
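To make the rule concrete, here is a small sketch of which positions each row would use. The helper name `window_bounds` is made up for this illustration; it only restates the rule above and is not a solution.

```r
# Hypothetical helper: for row position i, the window ends at i - skip
# and covers windw_sz positions ending there.
window_bounds <- function(i, skip, windw_sz) {
  end <- i - skip                # last position included in the window
  start <- end - windw_sz + 1    # first position included in the window
  data.frame(i = i, start = start, end = end)
}

# First three rows of city "a" from the example data
b <- window_bounds(i = 1:3, skip = c(0, 1, 2), windw_sz = c(2, 1, 2))
b
```

A `start` below 1 means the window reaches back before the first observation of that city, so fewer real observations are available than windw_sz asks for.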
Any solution is appreciated, but I would especially enjoy something with data.table.
Thanks a lot for any suggestions.
Upvotes: 2
Views: 430
Reputation: 6483
I think data.table's frollsum() should work here:
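library(data.table)
dd <- data.table(value = 1:10,
                 offset = c(0, 1, 0, 0, 2, 0, 0, 0, 0, 1),
                 windowsize = c(1, 1, 1, 3, 3, 2, 0, 1, 0, 2))
# Sum over the last (windowsize + offset) values minus the sum over the last `offset` values,
# which leaves the `windowsize` values that end `offset` rows back
dd[, frollsum(value, windowsize + offset, adaptive = TRUE) - frollsum(value, offset, adaptive = TRUE)]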
I could not figure out how to pad the rolling sum with 0s when the window extends past the start of the data. Setting na.rm = TRUE did not help either.
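One possible way around the padding problem is to work from cumulative sums instead of frollsum(). This is only a sketch, under the assumption that a window running past the start of a city should use whatever observations exist and count the missing ones as 0. The helper names `roll_skip_sum` and `roll_skip_n` are made up for this example, and the skip and windw_sz values are typed in from the table printed in the question.

```r
library(data.table)

# Hypothetical helper: sum of the `w` values that end `skip` positions before each row.
# Positions before the first observation contribute 0 (assumed padding behaviour).
roll_skip_sum <- function(x, skip, w) {
  cs <- c(0, cumsum(x))                  # cs[k + 1] holds the sum of x[1..k]
  end <- pmax(seq_along(x) - skip, 0)    # last position included, clamped at 0
  start <- pmax(end - w, 0)              # position just before the window, clamped at 0
  cs[end + 1] - cs[start + 1]
}

# Hypothetical helper: number of real observations that fall inside each window.
roll_skip_n <- function(x, skip, w) {
  end <- pmax(seq_along(x) - skip, 0)
  end - pmax(end - w, 0)
}

dt <- data.table(rain = rep(5, 10), temp = 1:10,
                 skip = c(0, 1, 2, 2, 2, 0, 2, 1, 2, 2),
                 windw_sz = c(2, 1, 2, 1, 2, 1, 2, 2, 1, 2),
                 city = rep(c("a", "b"), each = 5), ord = rep(1:5, 2))
setorder(dt, city, ord)   # windows are positional, so rows must be ordered within city

dt[, `:=`(rain_sum = roll_skip_sum(rain, skip, windw_sz),
          temp_avg = roll_skip_sum(temp, skip, windw_sz) / roll_skip_n(temp, skip, windw_sz)),
   by = city]
dt
```

Dividing by the number of observations actually available, rather than by windw_sz, keeps the average from being pulled toward 0 by the padding. When a window contains no observations at all, the average is 0/0 and comes out as NaN, which seems a reasonable flag for "no data".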
Upvotes: 4