Reputation: 2253
I try to count the time difference between each 2 rows for groupby data by id
. Data looks like
id date
11 2021-02-04 10:34:46+03:00
11 2021-02-07 14:58:24+03:00
11 2021-02-07 19:23:28+03:00
11 2021-02-08 10:21:44+03:00
11 2021-02-09 11:36:09+03:00
I use that:
df['time_diff'] = df.groupby('id')['date'].diff().dt.seconds.div(60).fillna(0)
I've noticed that my result is incorrect. And when I use just it
df.groupby('id')['date'].diff()
I get that and it's correct
70225 NaT
72324 3 days 04:23:38
72367 0 days 04:25:04
72515 0 days 14:58:16
73343 1 days 01:14:25
...
But when I try to convert it into seconds
df.groupby('id')['date'].diff().dt.seconds
I get
70225 NaN
72324 15818.0
72367 15904.0
72515 53896.0
73343 4465.0
...
Why might it happen?
Upvotes: 1
Views: 305
Reputation: 4673
It's very difficult to answer this without a reproducible example, or an understanding of your desired behavior.
I suspect that you can do this with pd.Series.dt.total_seconds()
:
df.groupby('id')['date'].diff().dt.total_seconds()
If that doesn't work, you could try something like:
df.groupby('id')['date'].diff() / pd.Timedelta(seconds=1)
Upvotes: 1