Reputation: 1047
I would like to create an accumulative incremental counter that increases only when a condition is met.
DT <- data.table(id = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
b = c(10L, 5L, 3L, 4L, 2L, 6L, 1L, 3L, 5L, 7L))
I don't get the desired result with rleid
because when two conditions are met in consecutive rows, the increment is not performed
> DT[,count := rleid(b>=5),id]
> DT
id b count
1: 1 10 1
2: 1 5 1
3: 1 3 2
4: 1 4 2
5: 1 2 2
6: 1 6 3
7: 1 1 4
8: 2 3 1
9: 2 5 2
10: 2 7 2
The expected result is
> DT
id b count
1: 1 10 1
2: 1 5 2
3: 1 3 2
4: 1 4 2
5: 1 2 2
6: 1 6 3
7: 1 1 3
8: 2 3 1
9: 2 5 2
10: 2 7 3
Upvotes: 4
Views: 985
Reputation: 21908
We can also use accumulate
function for this purpose. Here are some notes on this solution:
accumulate
takes a two argument function as its .f
argument where .x
is the previous/ accumulated value and .y
is the current value in the sequence of values of vector b
count
as 1
thus remove the first value of b
cause we don't need it anymore and check the next value by .y
and if the condition is met it will be added by one otherwise it remains as is.library(dplyr)
library(purrr)
DT %>%
group_by(id) %>%
mutate(count = accumulate(b[-1], .init = 1,
~ if(.y >= 5) {
.x + 1
} else {
.x
}))
# A tibble: 10 x 3
# Groups: id [2]
id b count
<dbl> <int> <dbl>
1 1 10 1
2 1 5 2
3 1 3 2
4 1 4 2
5 1 2 2
6 1 6 3
7 1 1 3
8 2 3 1
9 2 5 2
10 2 7 3
Upvotes: 3
Reputation: 101129
Another data.table
option with cumsum
> DT[, count := (v <- cumsum(b >= 5)) - v[1] + 1, id][]
id b count
1: 1 10 1
2: 1 5 2
3: 1 3 2
4: 1 4 2
5: 1 2 2
6: 1 6 3
7: 1 1 3
8: 2 3 1
9: 2 5 2
10: 2 7 3
Upvotes: 3
Reputation: 886948
Here is an option with cumsum
. Grouped by 'id', get the cumulative sum of logical expression (b >= 5
). For 'id' 2, the first element that is greater than or equal to 5 is at position 2 (in the grouped position), thus the first row will be 0. Inorder to make this 1, an option is to convert it to factor
and then coerce to integer so that we get the integer storage values (R
indexing starts from 1)
DT[, count := as.integer(factor(cumsum(b >= 5))), id]
-output
DT
id b count
1: 1 10 1
2: 1 5 2
3: 1 3 2
4: 1 4 2
5: 1 2 2
6: 1 6 3
7: 1 1 3
8: 2 3 1
9: 2 5 2
10: 2 7 3
Upvotes: 6