Reputation: 37
My lack of experience in working with python in the past year has made me rather rusty, and I'm getting back into coding again.
I have a dataframe of events that has a 'start_time' column. What I need to do is create an 'end_time' column that has a time value that is 1 second less than the next row's start_time. This is an ask for doing event time calculations.
The desired output:
start_time end_time
0 00:00:00 07:59:59
1 08:00:00 08:20:04
2 08:20:05 08:29:19
3 08:29:20 08:29:20
4 08:29:21 08:35:14
5 08:35:15 08:55:21
6 08:55:22 08:57:20
7 08:57:21 09:02:23
8 09:02:24 09:14:07
9 09:14:08 09:15:03
I currently have code that will accomplish this, but from anything I've read here, and from what I remember, I really shouldn't be iterating through a dataframe in a for loop.
for ndx, row in df.iterrows():
if ndx != df[atnp_df.columns[0]].count() - 1:
df.iloc[ndx, 9] = pd.to_datetime(df.iloc[ndx+1, 8]) - timedelta(seconds=1)
(Hey, it works, but it's slow...)
How do I do this pythonically? I know the solution should be something like this:
df['end_time'] = pd.to_datetime(df['start_time']) - timedelta(seconds=1)
But, this subtracts 1 second from the start_time in the same row. I'm not quite sure how to access the next row's start time in this way.
Any and all help is greatly appreciated!
Upvotes: 1
Views: 216
Reputation: 294478
offsets
df.assign(end_time=pd.to_timedelta(df.start_time).shift(-1).sub(pd.offsets.Second(1)))
start_time end_time
0 00:00:00 0 days 07:59:59
1 08:00:00 0 days 08:20:04
2 08:20:05 0 days 08:29:19
3 08:29:20 0 days 08:29:20
4 08:29:21 0 days 08:35:14
5 08:35:15 0 days 08:55:21
6 08:55:22 0 days 08:57:20
7 08:57:21 0 days 09:02:23
8 09:02:24 0 days 09:14:07
9 09:14:08 NaT
A little cleaned up and returning formatted strings:
s = pd.to_timedelta(df.start_time).shift(-1).sub(pd.offsets.Second(1))
df.assign(end_time=s.add(pd.Timestamp('now').normalize()).dt.time.astype(str))
start_time end_time
0 00:00:00 07:59:59
1 08:00:00 08:20:04
2 08:20:05 08:29:19
3 08:29:20 08:29:20
4 08:29:21 08:35:14
5 08:35:15 08:55:21
6 08:55:22 08:57:20
7 08:57:21 09:02:23
8 09:02:24 09:14:07
9 09:14:08 NaT
Upvotes: 4