S. Ash
S. Ash

Reputation: 68

How can I merge these two datasets based on all closest timestamps?

I have two dataframes and I want to merge them based on timestamps, but keep all timestamps. Essentially, to combine each med timestamp (dataframe a) with all the lab timestamps (dataframe b) before and after it until the next med timestamp.

I've tried just merging them and doing a rolling join.

I want to merge dataframe a with dataframe b to get dataframe c.

first dataframe - med times (a)

a<-data.frame("Patient" = c(rep("A", times = 2)),"Med_Time" = c(as.POSIXct("2018-05-11 10:37"), as.POSIXct("2018-05-12 17:16")))

second dataframe - lab times (b)

b<-data.frame("Patient" = c(rep("A", times = 13)),"Lab_Time" = c(as.POSIXct("2018-05-11 02:15:00"),
             as.POSIXct("2018-05-11 06:25:00"),
             as.POSIXct("2018-05-11 12:45:00"),
             as.POSIXct("2018-05-11 16:51:00"),
             as.POSIXct("2018-05-11 21:51:00"),
             as.POSIXct("2018-05-12 05:46:00"),
             as.POSIXct("2018-05-12 12:42:00"),
             as.POSIXct("2018-05-12 17:09:00"),
             as.POSIXct("2018-05-12 21:16:00"),
             as.POSIXct("2018-05-13 06:04:00"),
             as.POSIXct("2018-05-13 10:45:00"),
             as.POSIXct("2018-05-13 16:02:00"),
             as.POSIXct("2018-05-13 21:40:00")),"Lab_Res" = c(70,80,122,180,161,170,210,212,278,156,172,174,165))

expected result (c)

c<-data.frame("Patient" = c(rep("A", times = 13)),"Med_Time" = c(rep(as.POSIXct("2018-05-11 10:37:00"), times = 8),
             rep(as.POSIXct("2018-05-12 17:16:00"), times = 5)),"Lab_Time" = c(as.POSIXct("2018-05-11 02:15:00"),
             as.POSIXct("2018-05-11 06:25:00"),
             as.POSIXct("2018-05-11 12:45:00"),
             as.POSIXct("2018-05-11 16:51:00"),
             as.POSIXct("2018-05-11 21:51:00"),
             as.POSIXct("2018-05-12 05:46:00"),
             as.POSIXct("2018-05-12 12:42:00"),
             as.POSIXct("2018-05-12 17:09:00"),
             as.POSIXct("2018-05-12 21:16:00"),
             as.POSIXct("2018-05-13 06:04:00"),
             as.POSIXct("2018-05-13 10:45:00"),
             as.POSIXct("2018-05-13 16:02:00"),
             as.POSIXct("2018-05-13 21:40:00")),"Lab_Res" = c(70,80,122,180,161,170,210,212,278,156,172,174,165))

Any insight would be helpful! Thank you!

Upvotes: 0

Views: 184

Answers (1)

akrun
akrun

Reputation: 886948

We can use data.table join

library(data.table)
setDT(a)[setDT(b)[, .(Patient, Med_Time = Lab_Time, Lab_Time, Lab_Res)], 
      on = .(Patient, Med_Time), roll = -Inf]

Upvotes: 1

Related Questions