Reputation: 11895
This question is an add-on to this one. I thought the additional condition was substantial enough to create a new question.
I have a data frame of dates and events as below.
data <- data.frame( date= as.Date(c(rep("24.07.2012",12), rep("25.07.2012",8)), format="%d.%m.%Y"),
event=rep(0,20) )
data$event[10] <- 1
data$event[15] <- 1
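I want to add a start counter column that increments in steps of 10 and resets to 0 under two conditions: (1) on the row after an event of 1, and (2) whenever the date differs from the previous row's date.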
So the desired output with this additional start column would be:
date event start
1 2012-07-24 0 0
2 2012-07-24 0 10
3 2012-07-24 0 20
4 2012-07-24 0 30
5 2012-07-24 0 40
6 2012-07-24 0 50
7 2012-07-24 0 60
8 2012-07-24 0 70
9 2012-07-24 0 80
10 2012-07-24 1 90
11 2012-07-24 0 0
12 2012-07-24 0 10
13 2012-07-25 0 0
14 2012-07-25 0 10
15 2012-07-25 1 20
16 2012-07-25 0 0
17 2012-07-25 0 10
18 2012-07-25 0 20
19 2012-07-25 0 30
20 2012-07-25 0 40
The linked question has very good solutions that only cater for condition 1.
With the addition of condition 2, we now need to keep track of the date value for the (n-1)th row, so I am guessing this complicates the solution.
Any ideas to tackle this without a for loop?
Upvotes: 1
Views: 109
Reputation: 132706
library(data.table)
setDT(data)
data[, start := 10 * (seq_along(event) - 1),
by=list(date, cumsum(c(1L, diff(event) == -1L)))]
#turn the data.table into a data.frame:
class(data) <- "data.frame"
As @David Arenburg reminds me, there is now setDF, which does essentially the same plus some additional clean-up.
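For completeness, here is a minimal sketch of the same steps ending with setDF instead of the class assignment. It rebuilds the example data from the question so that it runs on its own, and it assumes a data.table version recent enough to include setDF.

```r
library(data.table)

# Example data from the question
data <- data.frame(
  date  = as.Date(c(rep("24.07.2012", 12), rep("25.07.2012", 8)),
                  format = "%d.%m.%Y"),
  event = rep(0, 20)
)
data$event[c(10, 15)] <- 1

setDT(data)  # convert to data.table by reference

# Counter restarts for every (date, run) group
data[, start := 10 * (seq_along(event) - 1),
     by = list(date, cumsum(c(1L, diff(event) == -1L)))]

setDF(data)  # convert back to a plain data.frame by reference
class(data)
```

Both setDT and setDF modify the object in place, so no reassignment is needed.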
data
# date event start
#1 2012-07-24 0 0
#2 2012-07-24 0 10
#3 2012-07-24 0 20
#4 2012-07-24 0 30
#5 2012-07-24 0 40
#6 2012-07-24 0 50
#7 2012-07-24 0 60
#8 2012-07-24 0 70
#9 2012-07-24 0 80
#10 2012-07-24 1 90
#11 2012-07-24 0 0
#12 2012-07-24 0 10
#13 2012-07-25 0 0
#14 2012-07-25 0 10
#15 2012-07-25 1 20
#16 2012-07-25 0 0
#17 2012-07-25 0 10
#18 2012-07-25 0 20
#19 2012-07-25 0 30
#20 2012-07-25 0 40
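To see why the grouping works, it may help to look at the grouping key on its own. The sketch below uses only base R; the names run_id and key are illustrative and not part of the answer above.

```r
# Example data from the question
data <- data.frame(
  date  = as.Date(c(rep("24.07.2012", 12), rep("25.07.2012", 8)),
                  format = "%d.%m.%Y"),
  event = rep(0, 20)
)
data$event[c(10, 15)] <- 1

# diff(event) equals -1 exactly where a 1 is followed by a 0,
# so the cumulative sum starts a new run on the row after an event.
run_id <- cumsum(c(1L, diff(data$event) == -1L))

# Combining the run with the date also starts a new group on a date change.
key <- paste(data$date, run_id)

# Lengths of the consecutive groups, in row order
group_sizes <- rle(key)$lengths
group_sizes
```

One caveat, as far as I can tell: if two events occur on consecutive rows, diff(event) is 0 between them, so this key does not reset between the two, whereas a rule of "reset on every row after an event" would. With the example data this case does not arise.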
Upvotes: 4
Reputation: 11895
For reference, this is the ugly for loop answer that I want to avoid:
# for loop solution
data$start_loop <- rep(0, nrow(data))
for (r in 2:nrow(data)) {
  # reset after an event or on a date change; otherwise add 10
  if (data$event[r - 1] == 1 || data$date[r] != data$date[r - 1]) {
    data$start_loop[r] <- 0
  } else {
    data$start_loop[r] <- data$start_loop[r - 1] + 10
  }
}
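The two reset conditions in the loop can also be written as a single logical vector, which gives a loop-free version in base R. This is only a sketch under the assumption that the data are already sorted as in the example; the names reset and grp are illustrative.

```r
# Example data from the question
data <- data.frame(
  date  = as.Date(c(rep("24.07.2012", 12), rep("25.07.2012", 8)),
                  format = "%d.%m.%Y"),
  event = rep(0, 20)
)
data$event[c(10, 15)] <- 1

# TRUE on the first row, on every row after an event,
# and on every row whose date differs from the previous row's date
reset <- c(TRUE,
           head(data$event, -1) == 1 |
             tail(data$date, -1) != head(data$date, -1))

# Each reset starts a new group
grp <- cumsum(reset)

# Position within the group (1, 2, 3, ...) turned into 0, 10, 20, ...
data$start <- 10 * (ave(seq_along(grp), grp, FUN = seq_along) - 1)
data$start
```

Because reset mirrors the if condition of the loop row by row, this should agree with start_loop, including the case where events fall on consecutive rows.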
Upvotes: 0