Reputation: 1521
I have a table has column ID, prd, line. I would like to revise the line definition for each ID as long as the prd is the same with previous record, then keep the line same as the column new_line in the example below.
new_data <- data.frame('ID'=c('123','123','123','456','456','789','789','789'),
'prd' = c('a','a','b','b','a','c','c','d'),
'line' = c(1,2,3,1,2,2,3,4))
Is there a way to do that?
Upvotes: 0
Views: 44
Reputation: 388982
Using dplyr
, one way using lag
and cumsum
would be
library(dplyr)
new_data %>%
group_by(ID) %>%
mutate(new_line = first(line) + cumsum(prd != lag(prd, default = first(prd))))
# ID prd line new_line
# <fct> <fct> <dbl> <dbl>
#1 123 a 1 1
#2 123 a 2 1
#3 123 b 3 2
#4 456 b 1 1
#5 456 a 2 2
#6 789 c 2 2
#7 789 c 3 2
#8 789 d 4 3
Using data.table
rleid
it becomes a bit shorter
new_data %>%
group_by(ID) %>%
mutate(new_line = first(line) + data.table::rleid(prd) - 1)
Upvotes: 2