Finding value of the next row that satisfies a condition in R

Question

I have this dataframe:

    plans     date                    expiring_date
1   premium   2020-07-12 20:26:54     2020-08-11 20:26:54
2   premium   2020-08-15 00:11:54     2020-09-14 00:11:54
3   premium   2020-09-14 19:59:25     2020-10-14 19:59:25
4   single    2020-09-14 20:02:41     NA
5   double    2020-09-17 16:31:07     NA
6   double    2020-10-07 09:54:38     NA
7   premium   2020-10-15 12:06:57     2020-11-14 12:06:57
8   double    2020-10-15 12:08:46     NA
9   single    2020-10-15 12:13:59     NA

which is produced by this:

df <- data.frame(plans = c("premium", "premium", "premium", "single", "double", "double", "premium", "double", "single"),
                 date = as.POSIXct(c("2020-07-12 20:26:54", "2020-08-15 00:11:54", "2020-09-14 19:59:25",
                                     "2020-09-14 20:02:41", "2020-09-17 16:31:07", "2020-10-07 09:54:38",
                                     "2020-10-15 12:06:57", "2020-10-15 12:08:46", "2020-10-15 12:13:59")),
                 expiring_date = as.POSIXct(c("2020-08-11 20:26:54", "2020-09-14 00:11:54", "2020-10-14 19:59:25",
                                              NA, NA, NA, "2020-11-14 12:06:57", NA, NA)))

What I want is to add a new column that gets the date of the next row with a premium plan for the rows that have a premium plan. So the resulting dataframe will be this:

    plans     date                    expiring_date          next_premium_date
1   premium   2020-07-12 20:26:54     2020-08-11 20:26:54    2020-08-15 00:11:54
2   premium   2020-08-15 00:11:54     2020-09-14 00:11:54    2020-09-14 19:59:25
3   premium   2020-09-14 19:59:25     2020-10-14 19:59:25    2020-10-15 12:06:57
4   single    2020-09-14 20:02:41     NA                     NA
5   double    2020-09-17 16:31:07     NA                     NA
6   double    2020-10-07 09:54:38     NA                     NA
7   premium   2020-10-15 12:06:57     2020-11-14 12:06:57    NA
8   double    2020-10-15 12:08:46     NA                     NA
9   single    2020-10-15 12:13:59     NA                     NA

I am trying with dplyr library because I want to use the group_by function since I have multiple users and I want to do this for every user, but no result so far. Any ideas?

Karthik S · Accepted Answer

Does this work:

> library(dplyr)
> df %>% filter(plans == 'premium') %>% mutate(next_premium_date = lead(date)) %>% 
+   right_join(df) %>% arrange(date)
Joining, by = c("plans", "date", "expiring_date")
    plans                date       expiring_date   next_premium_date
1 premium 2020-07-12 20:26:54 2020-08-11 20:26:54 2020-08-15 00:11:54
2 premium 2020-08-15 00:11:54 2020-09-14 00:11:54 2020-09-14 19:59:25
3 premium 2020-09-14 19:59:25 2020-10-14 19:59:25 2020-10-15 12:06:57
4  single 2020-09-14 20:02:41                                
5  double 2020-09-17 16:31:07                                
6  double 2020-10-07 09:54:38                                
7 premium 2020-10-15 12:06:57 2020-11-14 12:06:57                
8  double 2020-10-15 12:08:46                                
9  single 2020-10-15 12:13:59

Finding value of the next row that satisfies a condition in R

Answers (2)

Related Questions