Reputation: 247
I am trying to calculate the time duration inside each sliding window for this data:
                     ID
DATE
2017-05-17 15:49:51   2
2017-05-17 15:49:52   5
2017-05-17 15:49:55   2
2017-05-17 15:49:56   3
2017-05-17 15:49:58   5
2017-05-17 15:49:59   5
In this example DATE is the index. I am trying to get the duration covered by each overlapping rolling window of size 3, i.e. the number of seconds between a row and the row two positions after it. The answer should look like this:
                     ID  duration
DATE
2017-05-17 15:49:51   2         4
2017-05-17 15:49:52   5         4
2017-05-17 15:49:55   2         3
2017-05-17 15:49:56   3         3
2017-05-17 15:49:58   5       NaN
2017-05-17 15:49:59   5       NaN
I tried:
df['duration'] = df.rolling(window=3).apply(df.index.max()-df.index.min())
But I got this error:
TypeError: 'DatetimeIndex' object is not callable
Upvotes: 2
Views: 581
Reputation: 1
I've just stumbled across this question and wanted to provide a solution using the rolling window approach:
import pandas as pd

def timediff(time_window: pd.Series) -> float:
    # Time between the earliest and latest timestamp in the window, in seconds.
    duration = time_window.index.max() - time_window.index.min()
    return duration.total_seconds()

# Roll over a column that holds real numbers: on an all-NaN column every
# window fails the min_periods check and the function is never called.
df['duration'] = df['ID'].rolling(window=3).apply(func=timediff, raw=False)
With raw=False (the default), the function receives a Series, so you can use index.max() - index.min() or index[-1] - index[0].
The only catch is that the function has to return a number, not a timedelta object.
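One thing to watch with the rolling approach: pandas labels each result at the last row of its window, while the expected output in the question puts it on the first row. Below is a minimal sketch on the question's sample data, assuming a fixed window of 3 rows, that shifts the result up by two rows to line it up. The variable name trailing is only for illustration.

```python
import pandas as pd

# Sample data from the question; DATE is the index.
df = pd.DataFrame(
    {"ID": [2, 5, 2, 3, 5, 5]},
    index=pd.DatetimeIndex(
        [
            "2017-05-17 15:49:51",
            "2017-05-17 15:49:52",
            "2017-05-17 15:49:55",
            "2017-05-17 15:49:56",
            "2017-05-17 15:49:58",
            "2017-05-17 15:49:59",
        ],
        name="DATE",
    ),
)


def timediff(time_window: pd.Series) -> float:
    # Seconds between the first and last timestamp of the window.
    return (time_window.index[-1] - time_window.index[0]).total_seconds()


# Trailing windows: each value is labelled at the LAST row of its window,
# so the first two rows are NaN.
trailing = df["ID"].rolling(window=3).apply(timediff, raw=False)

# Shift up by window - 1 = 2 rows so each value sits on the FIRST row
# of its window, as in the question's expected output.
df["duration"] = trailing.shift(-2)
print(df)
```

If you would rather keep the trailing alignment, just drop the shift(-2).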
Upvotes: 0
Reputation: 450
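import numpy as np

# Move DATE out of the index so it can be shifted like a regular column.
df.reset_index(inplace=True)
# Despite its name, this holds the timestamp two rows ahead (end of a 3-row window).
df['PREVIOUS_TIME'] = df.DATE.shift(-2)
# Difference in seconds; the last two rows have no row two ahead, so they become NaN.
df['duration'] = (df.PREVIOUS_TIME - df.DATE) / np.timedelta64(1, 's')
df.drop('PREVIOUS_TIME', axis=1, inplace=True)
df.set_index('DATE', inplace=True)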
This assumes that 'DATE' holds datetime values.
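The same shift idea can probably be written without resetting and restoring the index. This is only a sketch of a variant, not the code above: it assumes DATE is already a DatetimeIndex, and the names window, stamps and window_end are made up for the example.

```python
import pandas as pd

# Sample data from the question; DATE is the index.
df = pd.DataFrame(
    {"ID": [2, 5, 2, 3, 5, 5]},
    index=pd.DatetimeIndex(
        [
            "2017-05-17 15:49:51",
            "2017-05-17 15:49:52",
            "2017-05-17 15:49:55",
            "2017-05-17 15:49:56",
            "2017-05-17 15:49:58",
            "2017-05-17 15:49:59",
        ],
        name="DATE",
    ),
)

window = 3  # number of rows per window (hypothetical parameter name)

# The index as a Series, so it can be shifted and subtracted row by row.
stamps = df.index.to_series()

# Timestamp of the last row in each forward-looking window;
# the final window - 1 rows have no such row and become NaT.
window_end = stamps.shift(-(window - 1))

# Timedelta converted to seconds; NaT turns into NaN.
df["duration"] = (window_end - stamps).dt.total_seconds()
print(df)
```

Since this never calls a Python function per window, it should also scale better than rolling(...).apply on large frames.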
Upvotes: 4