Carl
Carl

Reputation: 111

Create a couting variable in grouped data based on criteria in different column in R

Within each “Id” I would like to create a counting variable (“Variable_x” ) starting at “1” that continues until “days” is > 150. When this happens, I would like Variable_x” to change to “2” and continue until the next time “days” is > 150 where it should change to "3" and so on as shown in the example below.

# Problem 
Id <- c("A","A","A","A","A","A","B","B","B","B","B","B","B","B")
days <- c(18, 29, 4, 160, 4, 5,1, 6, 170, 4,190,45, 60,1 )

DATA_pre <- data.frame(Id, days)
DATA_pre
Id days
1   A   18
2   A   29
3   A    4
4   A  160
5   A    4
6   A    5
7   B    1
8   B    6
9   B  170
10  B    4
11  B  190
12  B   45
13  B   60
14  B    1

# Preferred Solution 
Id <- c("A","A","A","A","A","A","B","B","B","B","B","B","B","B")
days <- c(18, 29, 4, 160, 4, 5,1, 6, 170, 4,190,45, 60,1 )
Variable_x <- c("1","1", "1", "2","2","2", "1","1","2","2","3","3","3","3")

DATA_post <- data.frame(Id, days, Variable_x) 
DATA_post 
 Id days Variable_x
1   A   18          1
2   A   29          1
3   A    4          1
4   A  160          2
5   A    4          2
6   A    5          2
7   B    1          1
8   B    6          1
9   B  170          2
10  B    4          2
11  B  190          3
12  B   45          3
13  B   60          3
14  B    1          3

Upvotes: 1

Views: 45

Answers (3)

akrun
akrun

Reputation: 886938

We can also do

library(dplyr)   
DATA_pre %>%
      group_by(Id) %>% mutate(Variable_x = as.integer(factor(cumsum(days > 150))))

Upvotes: 0

Prahlad
Prahlad

Reputation: 138

data <- DATA_pre
data <- data.table(data)
data[,Flag:=ifelse(days>150,1,0)]
data[,Count:=cumsum(Flag),by="Id")]
data[,Count:=Count+1]

Upvotes: 0

Ronak Shah
Ronak Shah

Reputation: 388817

We can use cumsum :

library(dplyr)
DATA_pre %>% group_by(Id) %>% mutate(Variable_x = cumsum(days > 150) + 1)

#   Id     days Variable_x
#   <fct> <dbl>      <dbl>
# 1 A        18          1
# 2 A        29          1
# 3 A         4          1
# 4 A       160          2
# 5 A         4          2
# 6 A         5          2
# 7 B         1          1
# 8 B         6          1
# 9 B       170          2
#10 B         4          2
#11 B       190          3
#12 B        45          3
#13 B        60          3
#14 B         1          3

This can be done using base R :

DATA_pre$Variable_x <- with(DATA_pre, ave(days > 150, Id, FUN = cumsum)) + 1

and data.table

library(data.table)
setDT(DATA_pre)[, Variable_x:= cumsum(days > 150) + 1, Id]

Upvotes: 2

Related Questions