Reputation: 75
I am trying to find the cumulative sum of x=c(1,2,3,4,5)
based on moving window of y=c(1,1,2,3,2)
. I tried the following code based on rollapply in zoo package:
rollapply(c(1,2,3,4,5),c(1,1,2,3,2),cumsum,align="right",partial=T)
My expected answer is 1,2,5,9,9 (because I am cumulating from the left side) but instead I get the below 4x3 table:
How can I correct the code to get the desired result?
Upvotes: 1
Views: 86
Reputation: 16981
Vectorized using cumsum
and diff
:
c(
diff(
matrix(
c(0, cumsum(x))[c(seq_along(y) - y, seq_along(y)) + 1], 2, length(y), 1
)
)
)
#> [1] 1 2 5 9 9
Benchmarking:
set.seed(0)
x <- seq.int(1e4)
y <- ceiling(x*runif(x))
library(zoo)
microbenchmark::microbenchmark(
rollapply = rollapply(x, y, sum, align = "right", partial = T),
mapply = mapply(\(a, b) sum(x[b + 1 - a:1]), y, seq_along(y)),
cumsum = c(diff(matrix(c(0, cumsum(x))[c(seq_along(y) - y, seq_along(y)) + 1], 2, length(y), 1))),
check = "equal",
times = 10,
unit = "relative"
)
#> Unit: relative
#> expr min lq mean median uq max neval
#> rollapply 2170.9093 2068.8286 1692.3103 1505.4102 1557.881 1542.7317 10
#> mapply 741.6908 699.2924 632.0529 511.8272 643.252 666.9789 10
#> cumsum 1.0000 1.0000 1.0000 1.0000 1.000 1.0000 10
Upvotes: 0
Reputation: 101335
I think you should use sum
instead of cumsum
> zoo::rollapply(c(1, 2, 3, 4, 5), c(1, 1, 2, 3, 2), sum, align = "right", partial = T)
[1] 1 2 5 9 9
and you can use mapply
as well to achieve the same output
mapply(\(a, b) sum(x[b + 1 - a:1]), y, seq_along(y))
set.seed(0)
x <- seq.int(1e4)
y <- sort(sample(1:10, length(x), TRUE))
library(zoo)
microbenchmark(
rollapply = rollapply(x, y, sum, align = "right", partial = T),
mapply = mapply(\(a, b) sum(x[b + 1 - a:1]), y, seq_along(y)),
check = "identical",
times = 10L,
unit = "relative"
)
shows rollapply
is not efficient as mapply
Unit: relative
expr min lq mean median uq max neval
rollapply 9.362684 9.32441 9.340693 9.31007 9.985161 9.342603 10
mapply 1.000000 1.00000 1.000000 1.00000 1.000000 1.000000 10
Upvotes: 3