user3387899

Reputation: 611

Count number of times a variable is repeated continuously in R

Consider the following MWE:

df <- data.frame(Day=1:10, Value = c("Yes","No","Yes", "Yes", "Yes", 
                                     "No", "No", "Yes","Yes",  "No"))

 Day Value
   1   Yes
   2    No
   3   Yes
   4   Yes
   5   Yes
   6    No
   7    No
   8   Yes    
   9   Yes
  10    No

I want an extra column that counts how many consecutive rows 'Value' has been 'Yes'. When Value is 'No', the new variable should always be 0. The first 'Yes' after a 'No' is set to 1. If the next observation is also 'Yes', it should be 2, and so on. As soon as the chain of 'Yes' is interrupted, the count for the next 'Yes' starts at 1 again. So my data frame should look like this:

Day Value Count
 1   Yes  1
 2    No  0
 3   Yes  1  
 4   Yes  2
 5   Yes  3
 6    No  0
 7    No  0
 8   Yes  1
 9   Yes  2
10    No  0 

Hope someone can help me out.

Upvotes: 5

Views: 1353

Answers (2)

akrun

Reputation: 887213

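We can use base R as well. We create a grouping variable ('grp') by comparing each element of the 'Value' column with the one before it and taking the cumsum of the resulting logical vector, so that every run of identical values gets its own id. This can then be used in ave to create a sequence within each run, and multiplying by (df$Value=='Yes') sets the 'No' rows to 0.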

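# run id: increases by 1 every time Value changes from the previous row
grp <- with(df, cumsum(c(TRUE, Value[-1L] != Value[-length(Value)])))
# position within each run, zeroed where Value is "No"
df$count <- ave(seq_along(df$Value), grp, FUN = seq_along) * (df$Value == 'Yes')
df$count
#[1] 1 0 1 2 3 0 0 1 2 0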
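
To make the one-liner above easier to follow, here is a sketch that splits it into its intermediate steps on the question's data. The names changed, pos and count are only illustrative and are not part of the original answer.

```r
# Same data as in the question
df <- data.frame(Day = 1:10,
                 Value = c("Yes", "No", "Yes", "Yes", "Yes",
                           "No", "No", "Yes", "Yes", "No"))

# TRUE wherever a value differs from the one before it
# (the first element has no predecessor, so it is always TRUE)
changed <- c(TRUE, df$Value[-1L] != df$Value[-nrow(df)])

# Running total of the changes gives one id per run of identical values
grp <- cumsum(changed)
grp
# [1] 1 2 3 3 3 4 4 5 5 6

# Position within each run, before the "No" rows are zeroed
pos <- ave(seq_along(grp), grp, FUN = seq_along)
pos
# [1] 1 1 1 2 3 1 2 1 2 1

# Multiplying by the logical mask turns every "No" position into 0
count <- pos * (df$Value == "Yes")
count
# [1] 1 0 1 2 3 0 0 1 2 0
```

Note that the runs of 'No' are numbered as well (rows 6 and 7 get 1 and 2); it is only the final multiplication that resets them to 0.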

Upvotes: 3

A5C1D2H2I1M1N2O1R2T1

Reputation: 193537

You can try using "data.table", specifically the rleid function, which gives each run of consecutive identical values its own id:

Example:

library(data.table)
as.data.table(df)[, count := sequence(.N), by = rleid(Value)][Value == "No", count := 0][]
#     Day Value count
#  1:   1   Yes     1
#  2:   2    No     0
#  3:   3   Yes     1
#  4:   4   Yes     2
#  5:   5   Yes     3
#  6:   6    No     0
#  7:   7    No     0
#  8:   8   Yes     1
#  9:   9   Yes     2
# 10:  10    No     0
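
If you want to see what the rleid grouping is doing without installing "data.table", the same idea can be roughly emulated in base R with rle. This is only a sketch of the logic, not how data.table implements it; runs, run_id and count are illustrative names.

```r
# Same data as in the question
df <- data.frame(Day = 1:10,
                 Value = c("Yes", "No", "Yes", "Yes", "Yes",
                           "No", "No", "Yes", "Yes", "No"))

# Run-length encoding: lengths and values of consecutive identical entries
runs <- rle(as.character(df$Value))

# One id per run, repeated for the length of the run
# (plays the role of rleid(Value) in the data.table call)
run_id <- rep(seq_along(runs$lengths), runs$lengths)

# Counter that restarts at 1 for every run
# (plays the role of sequence(.N) evaluated by run)
count <- sequence(runs$lengths)

# Runs of "No" do not count, so reset them to 0
count[df$Value == "No"] <- 0L

df$count <- count
df$count
# [1] 1 0 1 2 3 0 0 1 2 0
```

Here run_id is not needed for the final result, because sequence() already restarts at every run; it is shown only to make the correspondence with rleid visible.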

Upvotes: 4
