trench
trench

Reputation: 5355

Matching range of timestamps in pandas

I am having trouble doing this in pandas

DF1 (my left join):

Name |     TimeStart        | TimeEnd               | Values   | Order
John   12/24/2014 08:10:32    12/24/2014 08:14:21        2        1
John   12/24/2014 08:15:03    12/24/2014 08:22:49        2        2

DF2

Name |     TimeStart        | TimeEnd               | Values
John   12/24/2014 08:12:57    12/24/2014 08:13:31        8

TimeStart in DF2 is ALWAYS greater than the TimeStart in DF1 (this typically happens during the interaction). And for it to match, it would have to be less than the next row of data for that individual.

Here is my thought process. Shift the row to columns to see if they match. Then compare the DF2 TimeStart to be > than the initial TimeStart on DF1 but less than the next row TimeStart (same Name).

df1.sort(['Name', 'TimeStart'], ascending=[1, 1], inplace = True)
df1['Name_R'] = df1['Name'].shift(-1)
df1['Matching Row'] = np.where((df1['Name_R'] == df1['Name']), 1, 0)
df1['Next Timestamp'] = np.where(df1['Matching Row'] == 1, df1['TimeStart'].shift(-1), np.datetime64('nat'))
df1['test'] = np.where(df2['TimeStart'] > df1['TimeStart'] < df1['Next Timestamp'], 1, 0)

Edit - is it possible to do this with a asof command? The only trick is that the Name has to match, then we look for the closest timestamp for TimeStart on each file/dataframe.

Upvotes: 0

Views: 567

Answers (1)

chrisaycock
chrisaycock

Reputation: 37930

There is an "asof join" in pandas 0.19. For your example, just ignore the starting time and join by the most recent ending time.

pd.merge_asof(DF1, DF2, on='TimeEnd')

Upvotes: 1

Related Questions