curious
curious

Reputation: 51

Counter variable with conditions

I have a simple problem but somehow can't figure out the solution...

Here is an example of a data set:

dt = data.table(A=rep(c(1:2), each = 5), B = c(1,1,2,2,3,1,2,3,3,1), C =c("a","b","b","b","b","b","a","b","a","a"))

Basically, I want a counter counter variable which repeats it's value only when conditions are met. The conditions are that the subsequent rows in A should be the same, in B they should differ and in C they should be the same. This is the desired output:

dt = data.table(A=rep(c(1:2), each = 5), B = c(1,1,2,2,3,1,2,3,3,1), C =c("a","b","b","b","b","b","a","b","a","a"), counter = c(1,2,2,3,3,4,5,6,7,7))

As you can see, counter variable repeats it's value only when these conditions are met.

Thanks!

Upvotes: 1

Views: 85

Answers (2)

akrun
akrun

Reputation: 887251

Here is an option using shift from data.table

dt[, counter := cumsum(c(TRUE, !Reduce(`&`, c(Map(`==`, .SD,  
        shift(.SD)),  list(B != shift(B))))[-1])), .SDcols = c(1, 3)]

Upvotes: 0

Rui Barradas
Rui Barradas

Reputation: 76460

Use cumsum on a logical condition.

n <- nrow(dt)
dt$D <- c(1L, !c(dt$A[-n] == dt$A[-1] & dt$B[-n] != dt$B[-1] & dt$C[-n] == dt$C[-1]))
dt$D <- cumsum(dt$D)

Upvotes: 1

Related Questions