Reputation: 405
I have the following data frame:
example.frame = data.frame("ID" = c(1,1,1,1,2,2,2,3,3,3,3)
, "AL" = c(1,1,2,4,1,3,4,1,5,1,2)
, "marker" = c(0,0,0,0,0,0,0,0,0,1,1))
What I want to achieve is that, for every ID group, the marker variable is filled under the following condition: it is 1 for all rows following an AL of 5 or higher (in the row before); otherwise it is 0. Does anyone have a suggestion for how to solve this? I tried it with by(), but I do not know how to formulate the condition.
Thanks in advance.
Upvotes: 1
Views: 1504
Reputation: 17369
A solution with dplyr:
library(dplyr)
example.frame = data.frame("ID" = c(1,1,1,1,2,2,2,3,3,3,3)
, "AL" = c(1,1,2,4,1,3,4,1,5,1,2)) %>%
group_by(ID) %>%
mutate(marker = as.numeric(cummax(lag(AL, default = 0)) >= 5))
example.frame
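To see why this works, the expression can be traced by hand for the ID 3 group. This is only a sketch in base R, where the manual shift c(0, head(AL, -1)) stands in for dplyr's lag(AL, default = 0); the intermediate names lagged and running are made up for illustration.

```r
AL <- c(1, 5, 1, 2)                 # AL values for ID 3 in the question
lagged <- c(0, head(AL, -1))        # 0 1 5 1: each row sees the previous row's AL
running <- cummax(lagged)           # 0 1 5 5: largest previous AL seen so far
marker <- as.numeric(running >= 5)  # 0 0 1 1
marker
```

Because cummax never decreases, the marker stays at 1 once a previous AL of 5 or more has been seen, even if later values drop again.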
Upvotes: 1
Reputation: 51582
An idea via base R, which assumes that there is only one value >= 5 in each group:
with(example.frame, ave(AL, ID, FUN = function(i)
replace(cumsum(i >= 5), i >= 5, 0)))
#[1] 0 0 0 0 0 0 0 0 0 1 1
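The single-value assumption matters because cumsum keeps counting past the first hit. A minimal sketch, using a made-up vector x (not from the question) for one group that contains two values of 5 or more, compares the replace approach with a shifted cummax variant that should tolerate repeated hits:

```r
# Hypothetical AL values for a single ID group; two entries are >= 5.
x <- c(1, 5, 1, 6, 1)

# Approach from this answer: the second hit resets the marker to 0,
# and the last row gets a count of 2 rather than a 0/1 flag.
naive <- replace(cumsum(x >= 5), x >= 5, 0)
naive
# [1] 0 0 1 0 2

# Variant: shift the hit flags down one row, then carry the first 1
# forward with cummax so the marker never drops back.
hit <- as.integer(x >= 5)
robust <- cummax(c(0L, head(hit, -1)))
robust
# [1] 0 0 1 1 1
```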
Upvotes: 4
Reputation: 887078
We can use data.table:
library(data.table)
setDT(example.frame)[, marker := +((cumsum(shift(AL >=5, fill=FALSE)))>0), ID]
example.frame
# ID AL marker
# 1: 1 1 0
# 2: 1 1 0
# 3: 1 2 0
# 4: 1 4 0
# 5: 2 1 0
# 6: 2 3 0
# 7: 2 4 0
# 8: 3 1 0
# 9: 3 5 0
#10: 3 1 1
#11: 3 2 1
Upvotes: 2
Reputation: 38500
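Here is a base R solution with ave and cummax. Note that this version sets the marker to 1 starting in the row where AL first reaches 5, not in the row after it: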
example.frame$marker <- ave(example.frame$AL, example.frame$ID,
FUN=function(x) cummax(x >= 5))
example.frame
ID AL marker
1 1 1 0
2 1 1 0
3 1 2 0
4 1 4 0
5 2 1 0
6 2 3 0
7 2 4 0
8 3 1 0
9 3 5 1
10 3 1 1
11 3 2 1
Or, if the goal is to start in the row after a 5 or greater is encountered, you could include c and head like this:
ave(example.frame$AL, example.frame$ID, FUN=function(x) c(0, head(cummax(x >= 5), -1)))
[1] 0 0 0 0 0 0 0 0 0 1 1
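As a quick check, both variants can be compared against the marker column the question gives as the desired result. This is a sketch only; the names same_row and next_row are introduced here for illustration.

```r
# Data from the question; marker holds the asker's desired result.
example.frame <- data.frame(ID = c(1,1,1,1,2,2,2,3,3,3,3),
                            AL = c(1,1,2,4,1,3,4,1,5,1,2),
                            marker = c(0,0,0,0,0,0,0,0,0,1,1))

# Flag starts on the row where AL >= 5 is first seen.
same_row <- ave(example.frame$AL, example.frame$ID,
                FUN = function(x) cummax(x >= 5))

# Flag starts on the row after it.
next_row <- ave(example.frame$AL, example.frame$ID,
                FUN = function(x) c(0, head(cummax(x >= 5), -1)))

all(same_row == example.frame$marker)  # FALSE: row 9 is already flagged
all(next_row == example.frame$marker)  # TRUE
```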
Upvotes: 3