T.McMillen
T.McMillen

Reputation: 137

Two dataframes with identical columns, subset one if column time is within 24 hours in R

I have two data frames with identical variable titles, I would like to subset df1 by df2 if there is a match in a person's name and if their order time is less than 24 hours.

Name <- c("MCCARTNEY, PAUL", "STARR, RNGO", "HARRISON, GEORGE", "LENNON, JOHN")
Order_TM <-c("3/4/2020 15:16", "3/4/2020 15:16", "3/4/2020 15:16","3/4/2020 19:30")
df1 <-data.frame(Name, Order_TM)

In df2 I have the same Names but different order times

Name <- c("MCCARTNEY, PAUL", "STARR, RNGO", "HARRISON, GEORGE", "LENNON, JOHN")
Order_TM <-c("3/4/2020 18:16", "3/4/2020 20:16", "3/6/2020 15:16","3/5/2020 12:00")
df2 <-data.frame(Name, Order_TM)

I want to subset df1 if the Order_TM is less than 24 hours from the Order_TM in df2. Based on my example, the result would be MCCARTNEY, PAUL, STARR, RINGO and LENNON, JOHN. But I haven't been able to find a way to do this.

Upvotes: 0

Views: 37

Answers (2)

Ian Campbell
Ian Campbell

Reputation: 24838

If you prefer a dplyr solution, you could try this:

library(dplyr)
library(lubridate)
df1 %>% 
  left_join(df2,by = "Name") %>%
  mutate(Order_TM = Order_TM.x, TimeDiff = mdy_hm(Order_TM.x) - mdy_hm(Order_TM.y)) %>%
  filter(abs(TimeDiff) <= 24) %>%
  dplyr::select(-Order_TM.y,-Order_TM.x)
#             Name       Order_TM    TimeDiff
#1 MCCARTNEY, PAUL 3/4/2020 15:16  -3.0 hours
#2     STARR, RNGO 3/4/2020 15:16  -5.0 hours
#3    LENNON, JOHN 3/4/2020 19:30 -16.5 hours

Upvotes: 1

IceCreamToucan
IceCreamToucan

Reputation: 28695

# convert data frames to data.tables
library(data.table)
setDT(df1)
setDT(df2)

# convert Order_TM to datetime format
df1[, Order_TM := as.POSIXct(Order_TM, format = '%m/%d/%Y %R')]
df2[, Order_TM := as.POSIXct(Order_TM, format = '%m/%d/%Y %R')]

# join to find difference in hours between datetimes
df1[df2, on = .(Name), time_diff := abs(difftime(i.Order_TM, Order_TM, 'hours'))]

# subset based on time difference
df1[time_diff < 24]
#               Name            Order_TM  time_diff
# 1: MCCARTNEY, PAUL 2020-03-04 15:16:00  3.0 hours
# 2:     STARR, RNGO 2020-03-04 15:16:00  5.0 hours
# 3:    LENNON, JOHN 2020-03-04 19:30:00 16.5 hours

Upvotes: 1

Related Questions