MPA

Reputation: 1117

Filter DataFrame based on total number of seconds in a timedelta column

My df looks like this.

                 date            pre_date time_delta
0 2019-05-13 10:45:57 2019-05-13 10:45:57   00:00:00
1 2019-05-13 14:22:22 2019-05-13 10:45:57   03:36:25
2 2019-05-13 14:32:22 2019-05-13 14:22:22   00:10:00
3 2019-05-14 03:58:27 2019-05-13 14:32:22   13:26:05
4 2019-05-14 04:08:27 2019-05-14 03:58:27   00:10:00
5 2019-05-14 04:28:27 2019-05-14 04:08:27   00:20:00

My goal is to filter it by the 'time_delta' column's value as efficiently as possible.

sec = 500
df = df[(df['time_delta'] > pd.Timedelta(sec, unit='s')) | (df['time_delta'] == pd.Timedelta(0, unit='s'))]

It works better than a for loop, but it is still slow. Any suggestions?

Upvotes: 2

Views: 315

Answers (2)

cs95

Reputation: 402493

If "time_delta" is a column of timedelta type, then you can extract the total seconds as a float and compare plain numbers:

delta = df['time_delta'].dt.total_seconds()
df[(delta == 0) | (delta > sec)]
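
For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the sample frame from the question and applies the same comparison. How `pre_date` and `time_delta` are derived here (via `shift()` and subtraction) is an assumption for illustration; the question does not show how those columns were built.

```python
import pandas as pd

# Rebuild the sample data from the question (the construction is assumed).
dates = pd.to_datetime([
    "2019-05-13 10:45:57",
    "2019-05-13 14:22:22",
    "2019-05-13 14:32:22",
    "2019-05-14 03:58:27",
    "2019-05-14 04:08:27",
    "2019-05-14 04:28:27",
])
df = pd.DataFrame({"date": dates})

# The first row has no predecessor, so it is paired with itself (delta of 0).
df["pre_date"] = df["date"].shift().fillna(df["date"])
df["time_delta"] = df["date"] - df["pre_date"]

sec = 500

# Convert once to float seconds, then compare plain numbers.
delta = df["time_delta"].dt.total_seconds()
filtered = df[(delta == 0) | (delta > sec)]
```

With `sec = 500` every row of this small sample passes the filter, since the shortest non-zero gap is 600 seconds; raising `sec` above 600 starts dropping the ten-minute rows.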

Upvotes: 1

Quang Hoang

Reputation: 150745

Does this work for you:

mask = (df['time_delta'].dt
          .seconds
          .between(0,sec,
                   inclusive=False)
       )

df = df[mask]
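
A few caveats about a mask like the one above may be worth checking. First, `.dt.seconds` returns only the seconds component of each timedelta (0 to 86399) and ignores whole days, whereas `.dt.total_seconds()` returns the full duration. Second, `between(0, sec)` marks the rows that fall inside the range, which are the rows the question wants to drop, so the mask likely needs negating. Third, recent pandas versions expect a string for `inclusive`. The sketch below uses made-up durations (the Series is hypothetical, not the question's data) and assumes pandas 1.3 or newer for `inclusive="right"`.

```python
import pandas as pd

# Hypothetical durations, chosen to include one longer than a day.
td = pd.Series(pd.to_timedelta(["0s", "5min", "10min", "1 days 00:00:05"]))

component = td.dt.seconds        # seconds part only, whole days are dropped
total = td.dt.total_seconds()    # full duration in seconds, as floats

sec = 500

# Rows with 0 < delta <= sec are the ones the question wants to drop,
# so negate the mask to keep delta == 0 or delta > sec.
keep = ~total.between(0, sec, inclusive="right")
kept = td[keep]
```

Here the last entry has a seconds component of 5 but a total of 86405 seconds, so a mask built from `.dt.seconds` would treat it as a short gap. If every delta in the real data is under a day, the two accessors agree.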

Upvotes: 1
