Reputation: 51
I have a simple problem but somehow can't figure out the solution...
Here is an example of a data set:
dt = data.table(A=rep(c(1:2), each = 5), B = c(1,1,2,2,3,1,2,3,3,1), C =c("a","b","b","b","b","b","a","b","a","a"))
Basically, I want a counter counter variable which repeats it's value only when conditions are met. The conditions are that the subsequent rows in A should be the same, in B they should differ and in C they should be the same. This is the desired output:
dt = data.table(A=rep(c(1:2), each = 5), B = c(1,1,2,2,3,1,2,3,3,1), C =c("a","b","b","b","b","b","a","b","a","a"), counter = c(1,2,2,3,3,4,5,6,7,7))
As you can see, counter variable repeats it's value only when these conditions are met.
Thanks!
Upvotes: 1
Views: 85
Reputation: 887251
Here is an option using shift
from data.table
dt[, counter := cumsum(c(TRUE, !Reduce(`&`, c(Map(`==`, .SD,
shift(.SD)), list(B != shift(B))))[-1])), .SDcols = c(1, 3)]
Upvotes: 0
Reputation: 76460
Use cumsum
on a logical condition.
n <- nrow(dt)
dt$D <- c(1L, !c(dt$A[-n] == dt$A[-1] & dt$B[-n] != dt$B[-1] & dt$C[-n] == dt$C[-1]))
dt$D <- cumsum(dt$D)
Upvotes: 1