Zhubarb

Reputation: 11895

Incrementation with multiple conditions

This question is an add-on to this one. I thought the additional condition was substantial enough to create a new question.

I have a dataframe of dates and events as below.

data <- data.frame( date= as.Date(c(rep("24.07.2012",12), rep("25.07.2012",8)), format="%d.%m.%Y"), 
                    event=rep(0,20)   )
data$event[10] <- 1
data$event[15] <- 1

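I want to add a start counter column that increments in steps of 10 and resets to 0 under two conditions: (1) on the row after a row with event == 1, and (2) whenever the date differs from the previous row's date.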

So the desired output with this additional start column would be:

         date   event start
1  2012-07-24     0     0
2  2012-07-24     0    10
3  2012-07-24     0    20
4  2012-07-24     0    30
5  2012-07-24     0    40
6  2012-07-24     0    50
7  2012-07-24     0    60
8  2012-07-24     0    70
9  2012-07-24     0    80
10 2012-07-24     1    90
11 2012-07-24     0     0
12 2012-07-24     0    10
13 2012-07-25     0     0
14 2012-07-25     0    10
15 2012-07-25     1    20
16 2012-07-25     0     0
17 2012-07-25     0    10
18 2012-07-25     0    20
19 2012-07-25     0    30
20 2012-07-25     0    40

The linked question has very good solutions that only cater for condition 1.

With the addition of condition 2, we now need to keep track of the date value for the (n-1)th row. So I am guessing this complicates the solution.

Any ideas to tackle this without a for loop?

Upvotes: 1

Views: 109

Answers (2)

Roland

Reputation: 132706

library(data.table)
setDT(data)
data[, start := 10 * (seq_along(event) - 1), 
     by=list(date, cumsum(c(1L, diff(event) == -1L)))]
#turn the data.table into a data.frame:
class(data) <- "data.frame"

As @David Arenburg reminds me, there is now setDF, which does essentially the same plus some additional clean-up.

data
#         date event start
#1  2012-07-24     0     0
#2  2012-07-24     0    10
#3  2012-07-24     0    20
#4  2012-07-24     0    30
#5  2012-07-24     0    40
#6  2012-07-24     0    50
#7  2012-07-24     0    60
#8  2012-07-24     0    70
#9  2012-07-24     0    80
#10 2012-07-24     1    90
#11 2012-07-24     0     0
#12 2012-07-24     0    10
#13 2012-07-25     0     0
#14 2012-07-25     0    10
#15 2012-07-25     1    20
#16 2012-07-25     0     0
#17 2012-07-25     0    10
#18 2012-07-25     0    20
#19 2012-07-25     0    30
#20 2012-07-25     0    40
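
To see what the grouping in the by clause does, here is a rough base R sketch of the same idea. The names run_id and the use of ave() are my own choices for illustration, not part of the answer above. The run id goes up by one on the row right after an event (where diff(event) is -1), and the counter is then restarted within each combination of date and run id.

```r
# Rebuild the example data from the question
data <- data.frame(
  date  = as.Date(c(rep("24.07.2012", 12), rep("25.07.2012", 8)), format = "%d.%m.%Y"),
  event = rep(0, 20)
)
data$event[c(10, 15)] <- 1

# Run id: increases by one on the row right after an event (event drops from 1 to 0)
run_id <- cumsum(c(1L, diff(data$event) == -1L))

# Within each (date, run_id) group, number the rows from zero and scale by 10
data$start <- 10 * (ave(seq_along(data$event), data$date, run_id, FUN = seq_along) - 1)
```

Note that detecting the reset via diff(event) == -1 assumes an event row is always followed by a non-event row; with two consecutive event rows the second one would not restart the counter.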

Upvotes: 4

Zhubarb

Reputation: 11895

For reference, this is the ugly for-loop answer that I want to avoid:

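# for loop solution
data$start_loop <- rep(0, nrow(data))
# seq_len(...)[-1] is empty for a one-row data frame, unlike 2:nrow(data)
for (r in seq_len(nrow(data))[-1]) {
  # reset after an event row or when the date changes; otherwise add 10
  if (data$event[r - 1] == 1 || data$date[r] != data$date[r - 1]) {
    data$start_loop[r] <- 0
  } else {
    data$start_loop[r] <- data$start_loop[r - 1] + 10
  }
}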
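
The reset condition from the loop can also be evaluated for all rows at once by comparing each row with the one before it. This is only a sketch in base R; the names reset, grp and start_vec are made up for illustration. It marks the rows where the counter restarts, then measures each row's distance from the most recent restart.

```r
# Rebuild the example data from the question
data <- data.frame(
  date  = as.Date(c(rep("24.07.2012", 12), rep("25.07.2012", 8)), format = "%d.%m.%Y"),
  event = rep(0, 20)
)
data$event[c(10, 15)] <- 1

n <- nrow(data)

# TRUE where the counter restarts: the first row, the row after an event,
# or the first row of a new date (same test as in the loop, shifted by one row)
reset <- c(TRUE, data$event[-n] == 1 | data$date[-1] != data$date[-n])

# Group number of each row, then the row index at which its group started
grp <- cumsum(reset)
group_start <- which(reset)[grp]

# Distance from the group's first row, in steps of 10
data$start_vec <- 10 * (seq_len(n) - group_start)
```

Because the lagged comparison uses event[r-1] == 1 directly, this version restarts after every event row, including consecutive ones, which is the behaviour of the loop.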

Upvotes: 0
