Get Max Value per Run or Series in Sequence

Question

I am trying to get the max value per stretch of an indicator, i.e. per run of a repeating value.

Here is an example:

A = c(28, 20, 23, 30, 26, 23, 25, 26, 27, 25, 30, 26, 25, 22, 24, 25, 24, 27, 29)
B = c(0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)


df <- as.data.frame(cbind(A, B))
df
A B
28 0
20 1
23 1
30 0
26 0
23 1
25 1
26 1
27 0
25 0
30 1
26 1
25 1
22 0
24 1
25 0
24 0
27 0
29 1

For each group or stretch of 1's in column B, I want to find the max in column A. The max column could be either an indicator that the value in A is the max or the actual value in A, and it should be NA or 0 for all other rows.

The output I am hoping for looks something like this:

I've tried to generate groups for each section of column B that equals 1, but I did not get very far because most grouping functions require unique values between groups.

Also, please let me know if there are any improvements to the title for this problem.

akrun · Accepted Answer

One option would be data.table

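library(data.table)
# rleid(B) numbers each consecutive run of B; within a run, flag rows where A is the run maximum and B is 1
setDT(df)[, Max := +((A == max(A)) & B), by = rleid(B)]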
df
#     A B Max
# 1: 28 0   0
# 2: 20 1   0
# 3: 23 1   1
# 4: 30 0   0
# 5: 26 0   0
# 6: 23 1   0
# 7: 25 1   0
# 8: 26 1   1
# 9: 27 0   0
#10: 25 0   0
#11: 30 1   1
#12: 26 1   0
#13: 25 1   0
#14: 22 0   0
#15: 24 1   1
#16: 25 0   0
#17: 24 0   0
#18: 27 0   0
#19: 29 1   1
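
For readers unfamiliar with rleid(), the grouping it produces can be reproduced in base R. This is only a minimal sketch, not part of the answer above: it assumes the A and B vectors from the question, and the names run_id and run_max are made up for illustration.

```r
A <- c(28, 20, 23, 30, 26, 23, 25, 26, 27, 25, 30, 26, 25, 22, 24, 25, 24, 27, 29)
B <- c(0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)

# rle() finds the runs of B; repeating each run's index by its length
# gives one id per consecutive stretch (the same idea as rleid(B))
r <- rle(B)
run_id <- rep(seq_along(r$lengths), r$lengths)  # illustrative name

# per-run maximum of A, repeated for every row of its run
run_max <- ave(A, run_id, FUN = max)            # illustrative name

# 1 where A equals its run maximum and B is 1, otherwise 0
Max <- as.integer(A == run_max & B == 1)
```

Because every stretch gets its own id, two separate runs of 1's are never merged, which is what grouping directly on B would do. If a run contains a tied maximum, every tied row is flagged.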

Or, as @Frank mentioned, for better efficiency we can make use of gmax by first assigning the per-run maximum to a column and then replacing:

# store each run's maximum in MA, then set Max to 1 where A equals MA and B is 1
setDT(df)[, MA := max(A), by = rleid(B)][A == MA & B, Max := 1L][]
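
The question also allows the column to hold the actual value of A instead of a 0/1 indicator. A possible sketch of that variant, under the assumption that NA is acceptable for all other rows, is shown here; the table is rebuilt from the question's vectors so the snippet runs on its own, and the column name MaxVal is hypothetical.

```r
library(data.table)

A <- c(28, 20, 23, 30, 26, 23, 25, 26, 27, 25, 30, 26, 25, 22, 24, 25, 24, 27, 29)
B <- c(0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 1)
DT <- data.table(A, B)

# within each run of B, keep A where it is the run maximum and B is 1,
# and use NA everywhere else (MaxVal is an illustrative column name)
DT[, MaxVal := ifelse(B == 1 & A == max(A), A, NA_real_), by = rleid(B)]
DT[]
```

As with the indicator version, tied maxima within a run would each keep their value.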
