Reputation: 1117
My df looks like this.
                 date            pre_date time_delta
0 2019-05-13 10:45:57 2019-05-13 10:45:57   00:00:00
1 2019-05-13 14:22:22 2019-05-13 10:45:57   03:36:25
2 2019-05-13 14:32:22 2019-05-13 14:22:22   00:10:00
3 2019-05-14 03:58:27 2019-05-13 14:32:22   13:26:05
4 2019-05-14 04:08:27 2019-05-14 03:58:27   00:10:00
5 2019-05-14 04:28:27 2019-05-14 04:08:27   00:20:00
My goal is to filter it by the value of the 'time_delta' column as efficiently as possible.
sec = 500
df = df[(df['time_delta'] > pd.Timedelta(sec, unit='s')) | (df['time_delta'] == pd.Timedelta(0, unit='s'))]
It works better than a for loop, but it is still slow. Any suggestions?
Upvotes: 2
Views: 315
Reputation: 402493
If "time_delta" is a column of timedelta type, then you can extract the total seconds as a float and compare:
delta = df['time_delta'].dt.total_seconds()
df[(delta == 0) | (delta > sec)]
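As a minimal, self-contained sketch of the above, the sample data from the question can be rebuilt and filtered like this. The helper name `filter_by_delta` and the second threshold of 900 seconds are assumptions made for illustration only; with the question's `sec = 500` every sample row happens to pass.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-05-13 10:45:57", "2019-05-13 14:22:22", "2019-05-13 14:32:22",
        "2019-05-14 03:58:27", "2019-05-14 04:08:27", "2019-05-14 04:28:27",
    ]),
    "pre_date": pd.to_datetime([
        "2019-05-13 10:45:57", "2019-05-13 10:45:57", "2019-05-13 14:22:22",
        "2019-05-13 14:32:22", "2019-05-14 03:58:27", "2019-05-14 04:08:27",
    ]),
})
df["time_delta"] = df["date"] - df["pre_date"]


def filter_by_delta(frame, sec):
    """Hypothetical helper: keep rows whose delta is 0 or greater than `sec` seconds."""
    # Convert once to plain float seconds, then compare numerically.
    delta = frame["time_delta"].dt.total_seconds()
    return frame[(delta == 0) | (delta > sec)]


print(filter_by_delta(df, 500).index.tolist())  # threshold from the question
print(filter_by_delta(df, 900).index.tolist())  # illustrative larger threshold
```

Converting to seconds once and reusing `delta` avoids building a `pd.Timedelta` and comparing against the timedelta column twice.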
Upvotes: 1
Reputation: 150745
Does this work for you:
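mask = (df['time_delta'].dt
        .total_seconds()  # unlike .seconds, this includes the days component
        .between(0, sec,
                 inclusive='right')  # 0 < delta <= sec; string values need pandas >= 1.3
        )
df = df[~mask]  # keep rows where delta == 0 or delta > sec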
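Note that the question keeps rows where the delta is zero or greater than `sec`, so a mask that selects the range in between has to be inverted before indexing. A small sketch checking that equivalence, assuming the delta values from the question's table and an illustrative threshold of 900 seconds (the question uses 500); the range is written with explicit comparisons so it does not depend on how a given pandas version spells the `inclusive` argument of `between`:

```python
import pandas as pd

# Deltas from the question's table, expressed in seconds.
deltas = pd.Series(pd.to_timedelta([0, 12985, 600, 48365, 600, 1200], unit="s"))
sec = 900  # illustrative threshold, not from the question

seconds = deltas.dt.total_seconds()

# Rows the question wants to keep.
wanted = (seconds == 0) | (seconds > sec)

# Rows with 0 < delta <= sec, i.e. the ones to drop.
in_between = (seconds > 0) & (seconds <= sec)

# Inverting the range mask reproduces the question's condition.
assert (~in_between).equals(wanted)
print(deltas[~in_between].index.tolist())
```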
Upvotes: 1