Reputation: 65
I have a raw table as show below:
ColumnA(Index) | StarTime | EndTime |
---|---|---|
A | 2022-03-16 13:07:28 | 2022-03-16 13:26:10 |
A | 2022-03-16 13:38:28 | 2022-03-16 13:40:28 |
B | 2022-03-16 14:01:28 | 2022-03-16 14:10:28 |
C | 2022-03-16 14:19:28 | 2022-03-16 14:29:28 |
C | 2022-03-16 18:10:28 | 2022-03-16 18:18:28 |
C | 2022-03-16 18:28:28 | 2022-03-16 18:50:28 |
Question
I am seeking help to create a new data frame where there is a new column for each Index item called Interval (in minutes) which is equal to StartTime of the next row of index item(if it exists) minus EndTime of the previous row Item. For the first row of each index, the Interval should be 0.
Expected Output Table:
ColumnA(Index) | StarTime | EndTime | Interval(mins) |
---|---|---|---|
A | 2022-03-16 13:07:28 | 2022-03-16 13:26:10 | 0 |
A | 2022-03-16 13:38:28 | 2022-03-16 13:40:28 | 12 |
B | 2022-03-16 14:01:28 | 2022-03-16 14:10:28 | 0 |
C | 2022-03-16 14:19:28 | 2022-03-16 14:29:28 | 0 |
C | 2022-03-16 18:10:28 | 2022-03-16 18:18:28 | 10 |
C | 2022-03-16 18:28:28 | 2022-03-16 18:50:28 | 10 |
Upvotes: 0
Views: 68
Reputation: 23146
Try:
df["StarTime"] = pd.to_datetime(df["StarTime"])
df["EndTime"] = pd.to_datetime(df["EndTime"])
df = df.sort_values(["ColumnA(Index)","StarTime"])
df["Interval(mins)"] = df["StarTime"].sub(df["EndTime"].shift()).dt.total_seconds().div(60).where(df["ColumnA(Index)"].eq(df["ColumnA(Index)"].shift())).fillna(0)
>>> df
ColumnA(Index) StarTime EndTime Interval(mins)
0 A 2022-03-16 13:07:28 2022-03-16 13:26:10 0.0
1 A 2022-03-16 13:38:28 2022-03-16 13:40:28 12.3
2 B 2022-03-16 14:01:28 2022-03-16 14:10:28 0.0
3 C 2022-03-16 14:19:28 2022-03-16 14:29:28 0.0
4 C 2022-03-16 18:10:28 2022-03-16 18:18:28 221.0
5 C 2022-03-16 18:28:28 2022-03-16 18:50:28 10.0
Upvotes: 1