burton030

Reputation: 405

Using ifelse function by group

I have the following data frame:

example.frame = data.frame("ID" = c(1,1,1,1,2,2,2,3,3,3,3)
                           , "AL" = c(1,1,2,4,1,3,4,1,5,1,2)
                           , "marker" = c(0,0,0,0,0,0,0,0,0,1,1))

What I want to achieve is that, within every ID group, the marker variable is filled according to the following condition: it is 1 for all rows following a row with an AL of 5 or higher, and 0 otherwise. Does anyone have a suggestion for how to solve this? I tried it with by(), but I do not know how to formulate the condition.

Thanks in advance

Upvotes: 1

Views: 1504

Answers (4)

Benjamin

Reputation: 17369

A solution with dplyr

library(dplyr)
example.frame = data.frame("ID" = c(1,1,1,1,2,2,2,3,3,3,3)
                           , "AL" = c(1,1,2,4,1,3,4,1,5,1,2)) %>%
  group_by(ID) %>%
  mutate(marker = as.numeric(cummax(lag(AL, default = 0)) >= 5))

example.frame

Upvotes: 1

Sotos

Reputation: 51582

An idea via base R, which assumes that there is only one value >= 5 in each group:

with(example.frame, ave(AL, ID, FUN = function(i)
                                     replace(cumsum(i >= 5), i >= 5, 0)))

#[1] 0 0 0 0 0 0 0 0 0 1 1
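
To illustrate why the single-value assumption matters, here is a minimal sketch on a made-up group (the vector `i` below is hypothetical, not taken from the question's data) that contains two values >= 5. It also shows one possible variant that seems to avoid the assumption by flagging every row after the first hit; treat it as an illustration rather than a tested replacement.

```r
# Hypothetical group with two values >= 5 (not from the original data)
i <- c(1, 5, 1, 6, 2)

# The approach above: the second hit is reset to 0 and the count keeps growing
single_hit <- replace(cumsum(i >= 5), i >= 5, 0)
single_hit
# [1] 0 0 1 0 2

# Sketch of a variant: shift the "hit seen so far" flag down by one row
after_first_hit <- as.numeric(c(FALSE, head(cumsum(i >= 5) > 0, -1)))
after_first_hit
# [1] 0 0 1 1 1
```

The variant can be passed as the FUN argument of ave in the same way as the function above.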

Upvotes: 4

akrun

Reputation: 887078

We can use data.table

library(data.table)
setDT(example.frame)[, marker := +((cumsum(shift(AL >=5, fill=FALSE)))>0), ID]
example.frame
#    ID AL marker
# 1:  1  1      0
# 2:  1  1      0
# 3:  1  2      0
# 4:  1  4      0
# 5:  2  1      0
# 6:  2  3      0
# 7:  2  4      0
# 8:  3  1      0
# 9:  3  5      0
#10:  3  1      1
#11:  3  2      1

Upvotes: 2

lmo

Reputation: 38500

Here is a base R solution with ave and cummax

example.frame$marker <- ave(example.frame$AL, example.frame$ID,
                            FUN=function(x) cummax(x >= 5))

example.frame
   ID AL marker
1   1  1      0
2   1  1      0
3   1  2      0
4   1  4      0
5   2  1      0
6   2  3      0
7   2  4      0
8   3  1      0
9   3  5      1
10  3  1      1
11  3  2      1

Or, if the marker should start in the row after a value of 5 or greater is encountered, you could include c and head like this:

ave(example.frame$AL, example.frame$ID, FUN=function(x) c(0, head(cummax(x >= 5), -1)))
[1] 0 0 0 0 0 0 0 0 0 1 1

Upvotes: 3
