Reputation: 1
This is what my simplified dataframe looks like -
App IsNewSession
Word TRUE
Excel FALSE
Chrome TRUE
Notepad FALSE
Chrome FALSE
Notepad FALSE
Excel TRUE
Chrome FALSE
I need to create a new column called SessionNumber. Each time IsNewSession = TRUE, the session number should be the previous row's session number + 1. Otherwise, it just retains the same session number as the previous row.
Desired data frame -
App IsNewSession SessionNumber
Word TRUE 1
Excel FALSE 1
Chrome TRUE 2
Notepad FALSE 2
Chrome FALSE 2
Notepad FALSE 2
Excel TRUE 3
Chrome FALSE 3
I can do this using a for loop but my dataframe is pretty large (250K rows) and it takes a really long time.
I tried using mutate like this, but that doesn't work either.
df$SessionNumber = 1
library(dplyr)
df <- df %>%
mutate(SessionNumber = ifelse(IsNewSession, lag(SessionNumber) + 1, lag(SessionNumber)))
What is a good performant way to do this in R?
Thanks!
Upvotes: 0
Views: 338
Reputation: 1364
The suggestion in the comment doesn't work if the first value is FALSE.
df$SessionNumber <- cumsum(df$IsNewSession) + as.numeric(!df$IsNewSession[1])
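To see why a cumulative sum is enough here, below is a minimal sketch on the sample data from the question. It assumes the adjustment term is meant to look at the first value of IsNewSession, so that numbering starts at 1 even when the first row is FALSE. The vector `first_false` is a made-up example, not part of the original data.

```r
# Rebuild the sample data from the question.
df <- data.frame(
  App = c("Word", "Excel", "Chrome", "Notepad",
          "Chrome", "Notepad", "Excel", "Chrome"),
  IsNewSession = c(TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE)
)

# cumsum() counts TRUE as 1 and FALSE as 0, so the running total goes up
# by one on every row that starts a new session and stays flat otherwise.
# The second term adds 1 only when the first row is FALSE (assumption:
# session numbering should start at 1, not 0).
df$SessionNumber <- cumsum(df$IsNewSession) + as.numeric(!df$IsNewSession[1])
print(df)

# Hypothetical input whose first value is FALSE, to exercise the adjustment.
first_false <- c(FALSE, FALSE, TRUE, FALSE)
first_false_sessions <- cumsum(first_false) + as.numeric(!first_false[1])
print(first_false_sessions)
```

This is fully vectorized, so it should not be a problem at 250K rows. The mutate attempt in the question cannot work because lag(SessionNumber) reads the column as it was before the call (all 1s), not the values being computed row by row.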
Upvotes: 1