Lily
Lily

Reputation: 65

Cumulative Sum with Conditions in R

Say I have

a <- c(0, 22, 0, 2, 0, 0, 20, 20, 20, 0, 0)

I want to do a cumulative sum whereby I minus 5 to each value in a, and then add on the previous value.

However, I also have the condition that if a becomes less than 0, I want the cumsum to 0 and if a becomes greater than 40, for cumsum to 40.

So, I want to get

(0, 17, 12, 9, 4, 0, 15, 30, 40, 35, 30)

Can anyone help? I've been trying out a lot of things for a few hours now!

@Holger, that method doesn't always work. So if I add in a couple of extra zeros it does not come with the right solution

a <- c(0, 22, 0, 2, 0, 0, 0, 0, 20, 20, 20, 0, 0)

gives

 0 17 12  9  4  0  1  7 22 37 52 47 42

Upvotes: 4

Views: 1396

Answers (4)

lmo
lmo

Reputation: 38520

You can use Reduce to get the cumulative sum and combine this with max and min or pmin and pmax to get the bounds.

It is unclear whether you want to use the 0 and 40 in your cumulative summation or if you want bound the variable afterwards. Below, I've provided both possibilities.

Bound within the summation:

Reduce(function(x, y) min(max(x + y - 5, 0), 40), a, 0, accumulate=TRUE)
[1]  0  0 17 12  9  4  0 15 30 40 35 30

Bound after the summation

pmin(pmax(Reduce(function(x, y) x + y - 5, a, 0, accumulate=TRUE), 0), 40)
[1]  0  0 12  7  4  0  0  9 24 39 34 29

Upvotes: 1

G. Grothendieck
G. Grothendieck

Reputation: 270348

Here are some alternatives:

1) Loop Create a one line loop like this:

b <- a; for(i in seq_along(b)[-1]) b[i] <- min(40, max(0, a[i] - 5 + b[i-1]))
b
## [1]  0 17 12  9  4  0 15 30 40 35 30

2) Reduce

f <- function(b, a) min(40, max(0, a - 5 + b))
Reduce(f, a, acc = TRUE)
## [1]  0 17 12  9  4  0 15 30 40 35 30

3) recursion This recursive solution will be limited to inputs which are not too long.

rec <- function(a) {
   n <- length(a)
   if (n <= 1) a
   else {
     rec.hd <- Recall(a[-n])
     c(rec.hd, min(40, max(0, rec.hd[n-1] + a[n] - 5)))
   }
}
rec(a)
## [1]  0 17 12  9  4  0 15 30 40 35 30

Upvotes: 3

ExperimenteR
ExperimenteR

Reputation: 4473

Try

cumsum_up_low <- function(a, d=5, up=40, low=0 ){
  out = rep(0, length(a))
  out[1] = a[1]*(a[1]>=0 && a[1]<=40) + 0*(a[1]<0) + 40*(a[1] > 40)
  for(i in 2:length(a)){
    if(out[i-1] + a[i] - d > low && out[i-1] + a[i] - d < up){
      out[i] = out[i-1] + a[i] - d
    } else if(out[i-1] + a[i] - d <= low){
      out[i] = 0          
    } else out[i] = 40
  }
  out
}

cumsum_up_low(a, d=5, up=40, low=0)
# [1]  0 17 12  9  4  0 15 30 40 35 30

For long vectors

a <- sample(a, 1e6, TRUE)
system.time(cumsum_up_low(a))
#   user  system elapsed 
#   3.59    0.00    3.59 
library(compiler)
cumsum_up_low_compiled <- cmpfun(cumsum_up_low)
system.time(cumsum_up_low_compiled(a))
#   user  system elapsed 
#   0.28    0.00    0.28 

For a really long vectors

library(Rcpp)
cppFunction('
NumericVector cumsum_up_low_cpp(NumericVector a, double d, double up, double low) {
  NumericVector out(a.size());
  out[0] = a[0];
  for(int i=1; i<a.size(); i++){
    if(out[i-1] + a[i] - d > low & out[i-1] + a[i] - d < up){
      out[i] = out[i-1] + a[i] - d;
    } else if(out[i-1] + a[i] - d <= low){
      out[i] = 0;          
    } else out[i] = 40;
  }
  return out;
}')

a <- sample(a, 5e6, replace = TRUE)
system.time(cumsum_up_low_compiled(a, d=5, up=40, low=0))
#   user  system elapsed 
#   1.45    0.00    1.46 
system.time(cumsum_up_low_cpp(a, d=5, up=40, low=0))
#   user  system elapsed 
#   0.04    0.02    0.05 

Upvotes: 1

Amit Kohli
Amit Kohli

Reputation: 2950

This is definitely not the efficient way to do this, but it might be easiest to understand:

a <- c(0, 22, 0, 2, 0, 0, 20, 20, 20, 0, 0)

## Initialize another vector just like a
c <- a


## Do it easy-to-understand'ly in a for loop:
for (i in seq_along(a)){
  b <- a[i]
  if (i>1) {
    b <- b+c[i-1]
    b <- b-5
  } 
  if (b<0) b <- 0
  if (b>40) b <- 40
  c[i] <- b
  print(c[i])
}

Try to figure out each part, and if you need help, lemme know!

Upvotes: 0

Related Questions