TrevorRi
TrevorRi

Reputation: 11

How can I subtract 2 dataframes of different length by searching for the closest timestamp in R?

I am using R to extract data from a process historian using SQL. I have two dataframes, one of net weights (NetWt) with timestamps (100 rows) and another of weight setpoints (SetPt) with timestamps (6 rows). The setpoint is changed infrequently but a new bag weight is recorded every 30 seconds. I need to subtract the two such that I get a resultant dataframe of NetWt - SetPt for each timestamp in NetWt. In my last dataset the most recent SetPt timestamp is earlier than the first NetWt timestamp. I need a function that will go through each row in NetWt, take the timestamp, search for the closest timestamp before that time in the SetPt dataframe, return the most recent SetPt and output the difference (NetWt-SetPt).

I have researched merge, rbind, cbind, and I can't find a function to search backwards for the most recent SetPt value and merge that with the NetWt so that I can subtract them to plot the difference with time. Can anyone please help?

Data:

SetPtLines <- "Value,DateTime
51.35,2014-02-10 08:10:49
53.30,2014-02-10 07:52:37
53.10,2014-02-10 07:52:19
51.70,2014-02-10 07:50:26
51.35,2014-02-09 19:25:21
51.40,2014-02-09 19:13:11
51.50,2014-02-09 18:24:53
51.45,2014-02-09 16:10:38
51.40,2014-02-09 15:54:42"
SetPt <- read.csv(text=SetPtLines, header=TRUE)

NetWtLines <- "DateTime,Value
2014-02-11 12:51:50,50.90735
2014-02-11 12:52:24,50.22308
2014-02-11 12:52:55,50.88604
2014-02-11 12:53:27,50.69514
2014-02-11 12:53:58,51.38968
2014-02-11 12:54:29,50.96672"
NetWt <- read.csv(text=NetWtLines, header=TRUE)

There are 100 rows in NetWt.

Upvotes: 1

Views: 584

Answers (2)

Joshua Ulrich
Joshua Ulrich

Reputation: 176688

Here's a solution using xts. Note that your example would be more helpful if SetPt and NetWt included some overlapping observations.

library(xts)
# convert your data to xts
xSetPt <- xts(SetPt$Value, as.POSIXct(SetPt$DateTime))
xNetWt <- xts(NetWt$Value, as.POSIXct(NetWt$DateTime))
# merge them
xm <- merge(xNetWt, xSetPt)
# fill all missing values in the SetPt column with their prior value
xm$xSetPt <- na.locf(xm$xSetPt)
# plot the difference
plot(na.omit(xm$xNetWt - xm$xSetPt))

Upvotes: 0

Ricardo Saporta
Ricardo Saporta

Reputation: 55390

data.table has a roll argument which would probably be very helpful here

library(data.table)
NetWt <- as.data.table(NetWt)
SetPt <- as.data.table(SetPt)

## Only needed if dates are strings: 
## Ensure that your DateTime columns are actually times and not strings
NetWt[, DateTime := as.POSIXct(DateTime)]
SetPt[, DateTime := as.POSIXct(DateTime)]

## Next, set keys to the dates
setkey(NetWt, DateTime)
setkey(SetPt, DateTime)

## Join the two, use roll
NetWt[SetPt, NewValue := Value - i.Value, roll="nearest"]

## It's not clear which you want to subtract from which
SetPt[NetWt, NewValue := Value - i.Value, roll="nearest"]

Upvotes: 2

Related Questions