Reputation: 377
I tried using the following code, but it is not taking the difference in the group and throwing random values.
data.join(data.groupby(['a']).b.apply(lambda x: x.sort_values().diff().dt.seconds).fillna(0).reset_index(level=[0,1], drop=True).rename('timediff'))
My data looks something like:
| | a | b |
|---|---|----------------------------------|
| 0 | a | 2020-08-01 09:43:00.728131+05:30 |
| 1 | b | 2020-11-19 18:04:38.434056+05:30 |
| 2 | c | 2021-05-02 17:02:30.065254+05:30 |
| 3 | d | 2020-03-07 16:20:17.250333+05:30 |
| 4 | d | 2020-03-07 15:48:25.670649+05:30 |
| 5 | d | 2020-03-07 15:48:39.165078+05:30 |
| 6 | d | 2020-03-07 16:20:07.325826+05:30 |
I want something like
| | a | b |timediff(sec)|
|---|---|----------------------------------|-------------|
| 0 | a | 2020-08-01 09:43:00.728131+05:30 | 0 |
| 1 | b | 2020-11-19 18:04:38.434056+05:30 | 0 |
| 2 | c | 2021-05-02 17:02:30.065254+05:30 | 0 |
| 3 | d | 2020-03-07 16:20:17.250333+05:30 | 7 |
| 4 | d | 2020-03-07 15:48:25.670649+05:30 | 0 |
| 5 | d | 2020-03-07 15:48:39.165078+05:30 | 14 |
| 6 | d | 2020-03-07 16:20:07.325826+05:30 | 1888.160748 |
| 7 | a | 2020-08-01 09:44:00.728131+05:30 | 60 |
I need the time difference from the previous time for a particular group.
Time_difference = time_value - previous_time_value
Thanks in advance!
Upvotes: 1
Views: 91
Reputation: 862661
Sorting before groupby
, so possible avoid lambda function and use instead DataFrameGroupBy.diff
:
data.join(data.sort_values(['a','b']).groupby('a').b.diff().dt.total_seconds().fillna(0).rename('timediff'))
Upvotes: 1