Robert Weber

Reputation: 41

lag() not picking up the integer value in the previous row

I have a sample data frame with the same structure as the one I'm working with here:

library(dplyr)

df <- data.frame(cond_row = c(rep("no", 10), "yes", 
                              rep("no", 5), "yes", rep("no", 7)), 
                 count_row = 0, stringsAsFactors = FALSE)

df <- df %>% 
  mutate(count_row = ifelse(cond_row == "yes", 
                            lag(count_row) + 1, 
                            lag(count_row)))

I want count_row to increase by one every time cond_row equals "yes" and then hold that value until the next "yes", where it increases by one again, and so on. For this data, count_row should be 10 0s, then 6 1s, then 8 2s. The problem is that the ifelse() with lag() only works for the "yes" rows: count_row is 1 wherever cond_row equals "yes" but stays 0 wherever cond_row equals "no".

Upvotes: 1

Views: 113

Answers (1)

akrun

Reputation: 887118

We can use cumsum() on the logical vector cond_row == "yes". Each TRUE counts as 1, so the running total increases by one at every "yes" in cond_row and holds that value until the next "yes":

library(dplyr)
df %>% 
   mutate(count_row = cumsum(cond_row == 'yes'))
#   cond_row count_row
#1        no         0
#2        no         0
#3        no         0
#4        no         0
#5        no         0
#6        no         0
#7        no         0
#8        no         0
#9        no         0
#10       no         0
#11      yes         1
#12       no         1
#13       no         1
#14       no         1
#15       no         1
#16       no         1
#17      yes         2
#18       no         2
#19       no         2
#20       no         2
#21       no         2
#22       no         2
#23       no         2
#24       no         2

Or with base R:

df$count_row <- cumsum(df$cond_row == 'yes')
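As for why the lag() attempt in the question does not carry the count forward: mutate() evaluates the whole expression once on the original column, so lag(count_row) only ever sees the initial zeros, never the values being computed. The sketch below reproduces that in base R. The names prev, attempt and fixed are illustrative only, and prev is a hand-rolled stand-in for dplyr::lag() with its default arguments (shift by one, pad with NA).

```r
# Same data as in the question, as plain vectors
cond_row  <- c(rep("no", 10), "yes", rep("no", 5), "yes", rep("no", 7))
count_row <- rep(0, length(cond_row))

# Stand-in for dplyr::lag(count_row): shift down by one row, pad with NA.
# It is built from the ORIGINAL count_row, which is all zeros.
prev <- c(NA, head(count_row, -1))

# The question's logic: every "yes" row gets 0 + 1, every "no" row gets 0
# (row 1 has no previous row, so it is NA)
attempt <- ifelse(cond_row == "yes", prev + 1, prev)

# A running total accumulates instead of looking one row back
fixed <- cumsum(cond_row == "yes")

table(fixed)  # counts of 0s, 1s and 2s in the result
```

Because nothing in attempt depends on an earlier element of attempt itself, the increment is never carried into the following rows. cumsum() is the vectorized way to express that kind of running state.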

Upvotes: 1
