Reputation: 17676
My question is similar to "Efficient date range overlap calculation in python?". However, I need to calculate the overlap with full timestamps rather than days and, more importantly, I cannot specify a particular date for the overlap window, only hours.
import pandas as pd
import numpy as np
df = pd.DataFrame({'first_ts': {0: np.datetime64('2020-01-25 07:30:25.435000'),
1: np.datetime64('2020-01-25 07:25:00')},
'last_ts': {0: np.datetime64('2020-01-25 07:30:25.718000'),
1: np.datetime64('2020-01-25 07:25:00')}})
df['start_hour'] = 7
df['start_minute'] = 0
df['end_hour'] = 8
df['end_minute'] = 0
display(df)
How can I calculate the overlap duration, in milliseconds, of the interval (first_ts, last_ts) with the second interval (start_hour:start_minute to end_hour:end_minute)? Potentially, I would need to construct timestamps on each day for the interval defined by the hours and then calculate the overlap.
Upvotes: 2
Views: 378
Reputation: 862671
The idea is to create new Series for the window's start and end datetimes, taking the dates from the datetime columns. Then clamp the interval to the window with numpy.minimum and numpy.maximum, subtract, convert the timedeltas with Series.dt.total_seconds, and multiply by 1000:
s = (df['first_ts'].dt.strftime('%Y-%m-%d ') +
df['start_hour'].astype(str) + ':' +
df['start_minute'].astype(str))
e = (df['last_ts'].dt.strftime('%Y-%m-%d ') +
df['end_hour'].astype(str) + ':' +
df['end_minute'].astype(str))
s = pd.to_datetime(s, format='%Y-%m-%d %H:%M')
e = pd.to_datetime(e, format='%Y-%m-%d %H:%M')
df['inter'] = ((np.minimum(e, df['last_ts']) -
np.maximum(s, df['first_ts'])).dt.total_seconds() * 1000)
print (df)
first_ts last_ts start_hour start_minute \
0 2020-01-25 07:30:25.435 2020-01-25 07:30:25.718 7 0
1 2020-01-25 07:25:00.000 2020-01-25 07:25:00.000 7 0
end_hour end_minute inter
0 8 0 283.0
1 8 0 0.0
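Another idea is to use only np.minimum on the two durations. Note that this equals the overlap only when (first_ts, last_ts) lies entirely inside the window, as it does in the sample data: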
df['inter'] = (np.minimum(df['last_ts'] - df['first_ts'], e - s).dt.total_seconds() * 1000)
print (df)
first_ts last_ts start_hour start_minute \
0 2020-01-25 07:30:25.435 2020-01-25 07:30:25.718 7 0
1 2020-01-25 07:25:00.000 2020-01-25 07:25:00.000 7 0
end_hour end_minute inter
0 8 0 283.0
1 8 0 0.0
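To show where the two approaches diverge, here is a small sketch with a hypothetical interval that straddles the window start (06:59:59 to 07:00:01 against a 07:00 to 08:00 window). The timestamps are invented for illustration and are not from the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical interval that begins one second before the window opens.
first_ts = pd.Series(pd.to_datetime(['2020-01-25 06:59:59']))
last_ts = pd.Series(pd.to_datetime(['2020-01-25 07:00:01']))
s = pd.Series(pd.to_datetime(['2020-01-25 07:00']))
e = pd.Series(pd.to_datetime(['2020-01-25 08:00']))

one_ms = pd.Timedelta(milliseconds=1)

# Duration-only shortcut: the smaller of the two interval lengths.
shortcut = np.minimum(last_ts - first_ts, e - s) / one_ms

# Clamping approach: latest start to earliest end.
clamped = (np.minimum(e, last_ts) - np.maximum(s, first_ts)) / one_ms

print(shortcut.iloc[0], clamped.iloc[0])
```

Here the shortcut reports the interval's full two seconds, while the clamped version counts only the one second that falls inside the window. The clamped form is the safer choice when intervals may extend past the window bounds.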
Upvotes: 3