user5813583
user5813583

Reputation: 133

resetting cumsum if value goes to negative in r

ve <- c(17, -9, 9, -17, 17, -17, 11, -9, 16, -18, 17, 0, 0, -18, 17, 0, 0, -17, 14, -14, 17, -2, 0, -15, 9, -9, 17, -16, 16, -17, 17, -17, 17, -17, 17, -17, 17, -8, 7, -16, 17, -14, 14, -10, 10, -16, 16, -10, 10, -12, 12, -11, 11, -17, 17, -17, 17, -9, 8, -17, 17, -17, 17, -16, 16, -17, 17, -8, 8, -9, 9, -17, 17, -17, 17, -13, 13, -10, 7, -10, 13, -16, 17, -13, 13, -13, 13, -9, 8, -17, 17, -10, 9, -17, 17, -17, 17, -16, 16, -10, 10, -15, 15, -14, 14, -14, 15, -13, 13, -9, 9, -13, 13, -12, 12, -10, 9, -11, 12, -8, 7, -10, 10, -9, 9, -11, 11, -9, 9, -7, 7, -12, 11, -11, 12, -11, 11, -14, 14, -13, 13, -10, 10, -13, 13, -17, 17, -7, 7, -17, 17, -17, 17, -14, 14, NA)

df <- data.frame(ve = ve, calc = 0)

I need to calculate cumsum in column calc, but it needs to reset to zero and start again whenever its value goes negative.. I've tried several conditions but it's not really working...

Also, is it possible to achieve this in dplyr? I'm new to dplyr and find it somewhat difficult whenever I need to use dependent value..

Thank you for your help!

it should go as..

     ve calc
1    17    17
2    -9    8
3     9    17
4   -17    0
5    17    17
6   -17    0
7    11    11
8    -9    2
9    16    18
10  -18    0
11   17    17
12    0    17
13    0    17
14  -18    0
15   17    17

If you see row 14 and 15, with the normal cumsum it would be -1 and 16 but I want it to reset to 0 instead of -1 and continue cumsum, hence the next would be 17

Upvotes: 5

Views: 1769

Answers (5)

AnilGoyal
AnilGoyal

Reputation: 26218

base R's Reduce or purrr::accumulate are designed for these scenarios

df$calc <- Reduce(\(.x, .y) ifelse(.x + .y < 0, 0, .x + .y), df$ve, accumulate = TRUE)
df
#>      ve calc
#> 1    17   17
#> 2    -9    8
#> 3     9   17
#> 4   -17    0
#> 5    17   17
#> 6   -17    0
#> 7    11   11
#> 8    -9    2
#> 9    16   18
#> 10  -18    0
#> 11   17   17
#> 12    0   17
#> 13    0   17
#> 14  -18    0
#> 15   17   17
.
.
.

or in purrr

library(purrr)
library(dplyr)

df %>% mutate(calc = accumulate(ve,  ~ ifelse(.x + .y < 0, 0, .x + .y)))

Upvotes: 2

Ajay Ohri
Ajay Ohri

Reputation: 3492

> df$calc=ifelse(cumsum(df$ve)<0,0,cumsum(df$ve))

Upvotes: -1

AK88
AK88

Reputation: 3026

Not using dplyr, but this should work:

ve = as.data.frame(ve)
ve = na.omit(ve)
ve$cumS = 0
ve$cumS[1] = ve$ve[1]

for (i in 2 : length(ve$ve)) {

ve$cumS[i] = ifelse((ve$cumS[i - 1] + ve$ve[i]) < 0,
                     0, (ve$cumS[i - 1] + ve$ve[i]))
}

Upvotes: 1

akrun
akrun

Reputation: 886928

We can replace the NA values with 0 and use cumsum

library(dplyr)
df1 <- df %>%
      group_by(grp = cumsum(lag(cumsum(replace(ve, is.na(ve), 0)) < 0, default = TRUE))) %>%
     mutate(calc = cumsum(replace(ve, is.na(ve), 0)), calc = replace(calc, calc < 0, 0)) %>%
      ungroup() %>%
      select(-grp)
head(df1, 15)
# A tibble: 15 x 2
#      ve  calc
#   <dbl> <dbl>
# 1    17    17
# 2    -9     8
# 3     9    17
# 4   -17     0
# 5    17    17
# 6   -17     0
# 7    11    11
# 8    -9     2
# 9    16    18
#10   -18     0
#11    17    17
#12     0    17
#13     0    17
#14   -18     0
#15    17    17

Upvotes: 3

Marius
Marius

Reputation: 60060

Here is an iterative solution. I can't think of how to do this vectorized/using dplyr without multiple passes over the data, but I'm sure someone else will:

ve_csum = numeric(length(ve))

current_total = 0
for (i in 1:length(ve)) {
    if (is.na(ve[i])) {
        ve_csum[i] = current_total
        next
    }
    current_total = current_total + ve[i]
    if (current_total < 0) {
        current_total = 0
    }
    ve_csum[i] = current_total
}

result = data.frame(ve, ve_csum)

Upvotes: 1

Related Questions