Hagen Brenner

Reputation: 813

Calculate cumulative sums of certain values

Assume you have a data frame like this:

df <- data.frame(Nums = c(1,2,3,4,5,6,7,8,9,10), Cum.sums = NA)
> df
   Nums Cum.sums
1     1       NA
2     2       NA
3     3       NA
4     4       NA
5     5       NA
6     6       NA
7     7       NA
8     8       NA
9     9       NA
10   10       NA

and you want an output like this:

   Nums Cum.sums
1     1        0
2     2        0
3     3        0
4     4        3
5     5        5
6     6        7
7     7        9
8     8       11
9     9       13
10   10       15

The 4th element of the column Cum.sums is the sum of 1 and 2, the 5th element is the sum of 2 and 3, and so on. In other words, I would like to compute sums over the first column and store them in the second column. However, I don't want the usual cumulative sum: each value should be the sum of the element 3 rows above the current row and the element 2 rows above it.

I already tried playing around with the sum and cumsum functions, but without success.

Any ideas?

Thanks!

Upvotes: 5

Views: 3534

Answers (3)

Tomas

Reputation: 59585

Another solution, elegant and general, uses matrix multiplication. It builds a len x len matrix, so it is very inefficient for large data. That makes it not very practical, but it is a nice exercise:

len <- nrow(df)
sr <- 2 # number of rows to sum
lag <- 3 
mat <- matrix(
           head(c(
                 rep(0, lag * len), 
                 rep(rep(1:0, c(sr, len - sr + 1)), len)
               ), len * len), 
           nrow = len, byrow = TRUE
       )
mat %*% df$Nums
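
As a cross-check, the same weight matrix can be built from row and column indices, which makes its band structure easier to see. This is only a sketch of an alternative construction, not the code above; the names x, m, i, j and result are introduced here for illustration, and the input is assumed to be the Nums column from the question:

```r
x <- 1:10                      # assumed input: the Nums column from the question
len <- length(x)
sr <- 2                        # number of rows to sum
lag <- 3                       # how far above the current row the window starts

m <- matrix(0, len, len)
i <- row(m)                    # row index of every cell
j <- col(m)                    # column index of every cell

# Cell (i, j) is 1 when element j belongs to the window of output row i
mat <- (i > lag & j >= i - lag & j < i - lag + sr) * 1

result <- as.vector(mat %*% x)
result                         # 0 0 0 3 5 7 9 11 13 15
```

Row 4 of mat has ones in columns 1 and 2, row 5 in columns 2 and 3, and so on, so each row of the product picks out the two elements to add.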

Upvotes: 0

Tomas

Reputation: 59585

You don't need any special function, just use normal vector operations (these solutions are all equivalent):

df$Cum.sums[-(1:3)] <- head(df$Nums, -3) + head(df$Nums[-1], -2)

or

df <- within(df, Cum.sums[-(1:3)] <- head(Nums, -3) + head(Nums[-1], -2))

or

df$Cum.sums[-(1:3)] <- df$Nums[1:(nrow(df)-3)] + df$Nums[2:(nrow(df)-2)]

I believe the first 3 sums SHOULD be NA, not 0, but if you prefer zeroes, you can initialize the column before running the assignment:

df$Cum.sums <- 0
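
If the window width or the offset might change, the same result can be obtained from differences of an ordinary cumulative sum. The following is a minimal sketch, not part of the solutions above: lagged_window_sum and its arguments width and lag are hypothetical names, and it assumes lag >= width - 1 so that every window stays inside the vector:

```r
# Hypothetical helper: out[i] is the sum of `width` consecutive elements
# of x, starting `lag` positions before i; the first `lag` entries are NA.
lagged_window_sum <- function(x, width = 2, lag = 3) {
  n <- length(x)
  stopifnot(width >= 1, lag >= width - 1, n > lag)
  cs <- c(0, cumsum(x))          # cs[k + 1] is the sum of x[1:k]
  starts <- seq_len(n - lag)     # first index of each window
  out <- rep(NA_real_, n)
  # Sum of x[s:(s + width - 1)] equals cs[s + width] - cs[s]
  out[(lag + 1):n] <- cs[starts + width] - cs[starts]
  out
}

df <- data.frame(Nums = 1:10, Cum.sums = NA)
df$Cum.sums <- lagged_window_sum(df$Nums)
df$Cum.sums                      # NA NA NA 3 5 7 9 11 13 15
```

With the defaults this reproduces the values from the question, leaving the first three entries as NA.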

Upvotes: 0

Joshua Ulrich

Reputation: 176728

You could use the embed function to create the appropriate lags, rowSums to sum, then lag appropriately (I used head).

df$Cum.sums[-(1:3)] <- head(rowSums(embed(df$Nums,2)),-2)
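
To see why this works, it may help to look at the intermediate values. A minimal sketch, assuming the same ten-row df as in the question; the names lagged and pair_sums are introduced here only for illustration:

```r
df <- data.frame(Nums = 1:10, Cum.sums = NA)

# embed(x, 2) returns a 9 x 2 matrix whose row i holds x[i + 1] and x[i]
lagged <- embed(df$Nums, 2)

# Each row sum adds two adjacent elements: 3, 5, ..., 19
pair_sums <- rowSums(lagged)

# Drop the last two sums and write the remaining seven into rows 4 to 10
df$Cum.sums[-(1:3)] <- head(pair_sums, -2)
df$Cum.sums                      # NA NA NA 3 5 7 9 11 13 15
```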

Upvotes: 3
