Continual summation of a column in R until condition is met

Question

I am doing my best to learn R, and this is my first post on this forum.

I currently have a data frame with a populated vector "x" and an unpopulated vector "counter" as follows:

x <- c(NA,1,0,0,0,0,1,1,1,1,0,1)

df <- data.frame("x" = x, "counter" = 0)

    x counter
1  NA       0
2   1       0
3   0       0
4   0       0
5   0       0
6   0       0
7   1       0
8   1       0
9   1       0
10  1       0
11  0       0
12  1       0

I am having a surprisingly difficult time trying to write code that will simply populate counter so that counter sums the cumulative, sequential 1s in x, but reverts back to zero when x is zero. Accordingly, I would like counter to calculate as follows per the above example:

    x counter
1  NA      NA
2   1       1
3   0       0
4   0       0
5   0       0
6   0       0
7   1       1
8   1       2
9   1       3
10  1       4
11  0       0
12  1       1

I have tried using lag() and ifelse(), both with and without for loops, but seem to be getting further and further away from a workable solution. While lag() got me close, the figures were not calculating as expected, and my ifelse() and for loop attempts eventually ended up with length-1 vectors of NA_real_, NA or 1. I have also considered cumsum(), but I am not sure how to restrict it to just the runs of 1s. I have searched and reviewed similar posts, for example "How to add value to previous row if condition is met"; however, I still cannot figure out what I would expect to be a very simple task.

Admittedly, I am at a low point in my early R learning curve and greatly appreciate any help and constructive feedback anyone from the community can provide. Thank you.

Ronak Shah · Accepted Answer

You can use:

library(dplyr)

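df %>%
  # start a new group at each 0 in x (NA is treated as 0 for grouping only)
  group_by(x1 = cumsum(replace(x, is.na(x), 0) == 0)) %>%
  # position within the group minus 1, times x (keeps 0 as 0 and NA as NA)
  mutate(counter = (row_number() - 1) * x) %>%
  ungroup() %>%
  select(-x1)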

#       x counter
#   <dbl>   <dbl>
# 1    NA      NA
# 2     1       1
# 3     0       0
# 4     0       0
# 5     0       0
# 6     0       0
# 7     1       1
# 8     1       2
# 9     1       3
#10     1       4
#11     0       0
#12     1       1

Explaining the steps:

Create a new grouping column (x1): replace NA in x with 0, then use cumsum to increment the group value by 1 whenever x is 0, so each 0 starts a new group and the 1s that follow it stay in that group.
Within each group, subtract 1 from the row number and multiply the result by x. The multiplication is necessary because it keeps counter at 0 where x is 0 and at NA where x is NA.
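
To make the two steps easier to follow, here is a minimal base R sketch that exposes the intermediate values. It is only an illustration of the same idea, not part of the dplyr code above; the names grp and pos are made up for this example, and ave() stands in for the grouped row_number().

```r
x <- c(NA, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)

# Step 1: group id that increases by 1 at every 0
# (NA is treated as 0 for grouping purposes only).
grp <- cumsum(replace(x, is.na(x), 0) == 0)
grp
# [1] 1 1 2 3 4 5 5 5 5 5 6 6

# Step 2: position of each element within its group (1, 2, 3, ...),
# then subtract 1 and multiply by x.
pos <- ave(seq_along(x), grp, FUN = seq_along)
counter <- (pos - 1) * x
counter
# [1] NA  1  0  0  0  0  1  2  3  4  0  1
```

Note that this approach relies on every run of 1s being preceded by a 0 or an NA, as in the example data. If x started directly with a 1, the first run would be counted from 0 instead of 1.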
