Reputation: 1721
computing the Time Difference based on Two rows, picked upon a condition. First Row is the Current Row, and Second Row is picked upon a condition. Condition can be Value==1 for example.
Illustratively
The Data Frame has Time Stamp and Event Occurred or Not.
Event is [ Master / B / C].
Output DataFrame should Compute, Time Left for Next Event . Eg: Master
Event took place at 12:33. So at 12:30 the Time Left for the Event=3 ie [ 12:33 - 12:30 ] in Minutes. We COuld fill NA if the last next Event is not within the Data Frame time window.
Upvotes: 1
Views: 1084
Reputation: 321
Load a test DataFrame:
df = pd.DataFrame({'year': [2019] * 5,
'month': [8] * 5,
'day': [16] * 5,
'hour': [12, 12, 12, 12, 13],
'minute': [1, 2, 3, 4, 5]})
df = pd.DataFrame(pd.to_datetime(df), columns=['Time_Stamp'])
df['Event_Master'] = [0, 0, 1, 0, 1]
Loaded DataFrame looks like:
Time_Stamp Event_Master
0 2019-08-16 12:01:00 0
1 2019-08-16 12:02:00 0
2 2019-08-16 12:03:00 1
3 2019-08-16 12:04:00 0
4 2019-08-16 13:05:00 1
To solve the problem, I first add a temporary column called 'Next_Timestamp' which at first just grabs the timestamp when given event is a 1. It is pd.NaT otherwise. We can then use the fillna method to backfill the pd.NaT values with the time of the next event for each row. It now contains the time of the next event for each row. Finally, we just subtract the 'Time_Stamp' column from the 'Next_Timestamp' column.
df['Next_Timestamp'] = df[df.Event_Master == 1].Time_Stamp
df['Next_Timestamp'].fillna(method='backfill', inplace=True)
df['TimeDiff'] = df.Next_Timestamp - df.Time_Stamp
The DataFrame now looks like:
Time_Stamp Event_Master Next_Timestamp TimeDiff
0 2019-08-16 12:01:00 0 2019-08-16 12:03:00 00:02:00
1 2019-08-16 12:02:00 0 2019-08-16 12:03:00 00:01:00
2 2019-08-16 12:03:00 1 2019-08-16 12:03:00 00:00:00
3 2019-08-16 12:04:00 0 2019-08-16 13:05:00 01:01:00
4 2019-08-16 13:05:00 1 2019-08-16 13:05:00 00:00:00
Finally, drop the temporary 'Next_Timestamp' column:
df.drop(['Next_Timestamp'], axis=1, inplace=True)
And the final DataFrame looks like:
Time_Stamp Event_Master TimeDiff
0 2019-08-16 12:01:00 0 00:02:00
1 2019-08-16 12:02:00 0 00:01:00
2 2019-08-16 12:03:00 1 00:00:00
3 2019-08-16 12:04:00 0 01:01:00
4 2019-08-16 13:05:00 1 00:00:00
Repeat for other columns as needed. Hope this helps!
Upvotes: 1