Matt Munson
Matt Munson

Reputation: 3003

Match each row in a table to a row in another table based on the difference between row timestamps

I have two unevenly-spaced time series that each measure separate attributes of the same system. The two series's data points are not sampled at the same times, and the series are not the same length. I would like to match each row from series A to the row of B that is closest to it in time. What I have in mind is to add a column to A that contains indexes to the closest row in B. Both series have a time column measured in Unix time (eg. 1459719755).

for example, given two datasets

a  time
2  1459719755
4  1459719772
3  1459719773

b  time
45 1459719756
2  1459719763
13 1459719766
22 1459719774

The first dataset should be updated to

a  time        index
2  1459719755  1
4  1459719772  4
3  1459719773  4

since B[1,]$time has the closest value to A[1,]$time, B[4,]$time has the closest value to A[2,]$time and A[3,]$time.

Is there any convenient way to do this?

Upvotes: 0

Views: 48

Answers (1)

IRTFM
IRTFM

Reputation: 263331

Try something like this:

(1+ecdf(bdat$time)(adat$time)*nrow(bdat))
[1] 1 4 4

Why should this work? The ecdf function returns another function that has a value from 0 to 1. It returns the "position" in the "probability range" [0,1] of a new value in a distribution of values defined by the first argument to ecdf. The expression is really just rescaling that function's result to the range [1, nrow(bdat)]. (I think it's flipping elegant.)

Another approach would be to use approxfun on the sorted values of bdat$time which would then let get you interpolated values. These might need to be rounded. Using them as indices would instead truncate to integer.

apf <- approxfun( x=sort(bdat$time), y=seq(length( bdat$time))  ,rule=2)
apf( adat$time)
#[1] 1.000 3.750 3.875
round( apf( adat$time))
#[1] 1 4 4

In both case you are predicting a sorted value from its "order statistic". In the second case you should check that ties are handled in the manner you desire.

Upvotes: 1

Related Questions