NineWasps
NineWasps

Reputation: 2253

Time difference: incorrect values of seconds

I try to count the time difference between each 2 rows for groupby data by id. Data looks like

id    date
11    2021-02-04 10:34:46+03:00
11    2021-02-07 14:58:24+03:00
11    2021-02-07 19:23:28+03:00
11    2021-02-08 10:21:44+03:00
11    2021-02-09 11:36:09+03:00

I use that:

df['time_diff'] = df.groupby('id')['date'].diff().dt.seconds.div(60).fillna(0)

I've noticed that my result is incorrect. And when I use just it

df.groupby('id')['date'].diff()

I get that and it's correct

70225                NaT
72324    3 days 04:23:38
72367    0 days 04:25:04
72515    0 days 14:58:16
73343    1 days 01:14:25
...

But when I try to convert it into seconds

df.groupby('id')['date'].diff().dt.seconds

I get

70225         NaN
72324     15818.0
72367     15904.0
72515     53896.0
73343      4465.0
...

Why might it happen?

Upvotes: 1

Views: 305

Answers (1)

NicholasM
NicholasM

Reputation: 4673

It's very difficult to answer this without a reproducible example, or an understanding of your desired behavior.

I suspect that you can do this with pd.Series.dt.total_seconds():

df.groupby('id')['date'].diff().dt.total_seconds()

If that doesn't work, you could try something like:

df.groupby('id')['date'].diff() / pd.Timedelta(seconds=1)

Upvotes: 1

Related Questions