chenzwei_wilson
chenzwei_wilson

Reputation: 13

Find the time of next instance in R

I got the first three columns and would like to find the fourth col (NextOrderTime), meaning that the next Orderfood == "Y" of each client.

DF:

Client    Time      Orderfood?  NextOrderTime 
 A   2021/1/7 12:00    N       2021/1/7 12:02 
 A   2021/1/7 12:01    N       2021/1/7 12:02 
 A   2021/1/7 12:02    Y       2021/1/7 12:06 
 B   2021/1/7 12:03    N       2021/1/7 12:08 
 B   2021/1/7 12:04    N       2021/1/7 12:08 
 B   2021/1/7 12:05    N       2021/1/7 12:08 
 A   2021/1/7 12:06    Y           NA 
 B   2021/1/7 12:07    N       2021/1/7 12:08 
 B   2021/1/7 12:08    Y           NA

Any help would be appreciated. Thanks for the time!

Upvotes: 1

Views: 34

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 389175

Here's a way using dplyr :

We get the next Orderfood == 'Y' row number for each Client and get the corresponding Time from it.

library(dplyr)

df %>%
  group_by(Client) %>%
  mutate(NextOrderTime = replace(row_number(), Orderfood == 'N', NA),  
         NextOrderTime = lead(NextOrderTime)) %>%
  tidyr::fill(NextOrderTime, .direction = 'up') %>%
  mutate(NextOrderTime = Time[NextOrderTime])

#  Client Time           Orderfood NextOrderTime 
#  <chr>  <chr>          <chr>     <chr>         
#1 A      2021/1/7 12:00 N         2021/1/7 12:02
#2 A      2021/1/7 12:01 N         2021/1/7 12:02
#3 A      2021/1/7 12:02 Y         2021/1/7 12:06
#4 B      2021/1/7 12:03 N         2021/1/7 12:08
#5 B      2021/1/7 12:04 N         2021/1/7 12:08
#6 B      2021/1/7 12:05 N         2021/1/7 12:08
#7 A      2021/1/7 12:06 Y         NA            
#8 B      2021/1/7 12:07 N         2021/1/7 12:08
#9 B      2021/1/7 12:08 Y         NA            

data

df <- structure(list(Client = c("A", "A", "A", "B", "B", "B", "A", 
"B", "B"), Time = c("2021/1/7 12:00", "2021/1/7 12:01", "2021/1/7 12:02", 
"2021/1/7 12:03", "2021/1/7 12:04", "2021/1/7 12:05", "2021/1/7 12:06", 
"2021/1/7 12:07", "2021/1/7 12:08"), Orderfood = c("N", "N", 
"Y", "N", "N", "N", "Y", "N", "Y")), row.names = c(NA, -9L),class = "data.frame")

Upvotes: 1

Dubukay
Dubukay

Reputation: 2071

This seems to do what you want:

df$NextOrderTime2 <- mapply(df$Client, df$Time, FUN = function(client, time){
  all_orders <- df[df$Client==client&df$`Orderfood?`=="Y"&df$Time>time,]
  format(all_orders$Time[1], format = "%Y/%m/%d %H:%M")
}, USE.NAMES = FALSE)

output

Essentially, we loop over each row and select the first time that belongs to the same client, involved ordering food, and is after the given row's time.

Upvotes: 0

Related Questions