Reputation:
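I have access to a data frame which contains, for each user and date, the number of visits life to date (columns user, date and nb_visit_life_to_date).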
I need to compute, for each row, the inactivity period known at that date, i.e. the number of days from the last visit to the date of the current row.
The date of the last visit can be deduced from nb_visit_life_to_date by counting the number of previous rows for the same user with the same value.
For instance, if I have 3 rows for the same user with the same number of life-to-date visits, then the 3rd row should get an inactivity period of 2 days.
Example with real data:
input <- data.frame(
user = c(1,1,1,1,1,2,2,2,2,2),
date = c(1,2,3,4,5,1,2,3,4,5),
nb_visit_life_to_date = c(1,1,1,2,3,1,2,2,2,2)
)
output <- data.frame(
input,
inactivity_period_from_previous_visit = c(0,1,2,0,0,0,0,1,2,3)
)
Ideally I'd like to use dplyr syntax, but I'm of course open to all solutions.
Upvotes: 0
Views: 841
Reputation: 193507
This is a straightforward rle (run length encoding) task:
sequence(rle(input$nb_visit_life_to_date)$lengths) - 1
# [1] 0 1 2 0 0 0 0 1 2 3
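One caveat: the one-liner runs over the whole column, so a run can spill from one user into the next when the first user's last value equals the second user's first value. Below is a minimal base R sketch of a per-user variant. It assumes rows are sorted by date within each user, that there is one row per day, and that nb_visit_life_to_date never decreases for a given user. The names inactivity, edge, edge_rle and edge_grouped are illustrative and not from the question.

```r
# Same data as in the question.
input <- data.frame(
  user = c(1,1,1,1,1,2,2,2,2,2),
  date = c(1,2,3,4,5,1,2,3,4,5),
  nb_visit_life_to_date = c(1,1,1,2,3,1,2,2,2,2)
)

# Position of each row within its (user, visit count) group, minus 1.
# Assumption: the visit count is non-decreasing per user, so equal
# values within a user are always consecutive rows.
inactivity <- ave(
  seq_along(input$user),
  input$user, input$nb_visit_life_to_date,
  FUN = seq_along
) - 1
inactivity
# [1] 0 1 2 0 0 0 0 1 2 3

# Hypothetical edge case: user 1 ends on the same count user 2 starts with.
edge <- data.frame(
  user = c(1,1,2,2),
  nb_visit_life_to_date = c(1,1,1,1)
)

# Plain rle ignores the user boundary and keeps counting.
edge_rle <- sequence(rle(edge$nb_visit_life_to_date)$lengths) - 1
edge_rle
# [1] 0 1 2 3

# The grouped version restarts at each new user.
edge_grouped <- ave(
  seq_along(edge$user),
  edge$user, edge$nb_visit_life_to_date,
  FUN = seq_along
) - 1
edge_grouped
# [1] 0 1 0 1
```

With dplyr, the same grouping could presumably be written as group_by(user, nb_visit_life_to_date) followed by mutate(inactivity_period_from_previous_visit = row_number() - 1), under the same assumptions.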
Upvotes: 2