user3904098

R: count the number of previous rows with the same value

I have a data frame which contains, for each user and date, the cumulative number of visits to date (nb_visit_life_to_date).

For each row, I need to compute the inactivity period known at that date, which is the number of days between the last visit and the date of the current row.

The date of the last visit can be deduced from the nb_visit_life_to_date by counting the number of previous rows with the same value.

For instance, if I have 3 rows for the same user with the same number of life-to-date visits, then the 3rd row should get an inactivity period of 2 days.

Example with real data:

input <- data.frame(
  user = c(1,1,1,1,1,2,2,2,2,2),
  date = c(1,2,3,4,5,1,2,3,4,5),
  nb_visit_life_to_date = c(1,1,1,2,3,1,2,2,2,2)
)

output <- data.frame(
  input,
  inactivity_period_from_previous_visit = c(0,1,2,0,0,0,0,1,2,3)
)

Ideally I'd like to use dplyr syntax, but I'm of course open to all solutions.

Upvotes: 0

Views: 841

Answers (1)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193507

This is a straightforward rle (run length encoding) task:

sequence(rle(input$nb_visit_life_to_date)$lengths) - 1
#  [1] 0 1 2 0 0 0 0 1 2 3
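
One caveat: the one-liner above runs over the whole column, so a run could carry over from one user to the next if a user's first value happens to equal the previous user's last value (this does not occur in the example data). A minimal base R sketch that applies the same rle idea per user is shown below. The helper name inactivity_by_user is hypothetical, not part of the original answer, and the sketch assumes the rows are sorted by date within each user, with one row per day.

```r
# Hypothetical helper (not from the original answer): applies the rle idea
# separately within each user, so a run never continues across users.
# Assumes rows are sorted by date within each user, one row per day.
inactivity_by_user <- function(user, visits) {
  ave(visits, user, FUN = function(x) sequence(rle(x)$lengths) - 1)
}

input <- data.frame(
  user = c(1,1,1,1,1,2,2,2,2,2),
  date = c(1,2,3,4,5,1,2,3,4,5),
  nb_visit_life_to_date = c(1,1,1,2,3,1,2,2,2,2)
)

input$inactivity_period_from_previous_visit <-
  inactivity_by_user(input$user, input$nb_visit_life_to_date)
input$inactivity_period_from_previous_visit
#  [1] 0 1 2 0 0 0 0 1 2 3

# A case where the ungrouped one-liner would give 0 1 2 3 instead:
# user 2 starts with the same count that user 1 ends with.
inactivity_by_user(c(1, 1, 2, 2), c(1, 1, 1, 1))
# [1] 0 1 0 1
```

ave() splits the values by user, applies the function to each piece, and puts the results back in the original row order, so the result can be assigned directly as a new column.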

Upvotes: 2
