Reputation: 508
I would like to do a transform on data that includes timedelta format.
My data looks like this with Time column as timedelta type.
user in out location overlap Time
0 ron 12/21/2021 10:11 12/21/2016 17:50 home 0 4:19:03
1 ron 12/21/2016 13:26 12/21/2016 13:52 office 2 0:25:28
2 april 12/21/2016 8:12 12/21/2016 17:27 office 0 8:15:03
3 april 12/21/2016 18:54 12/21/2016 22:56 office 0 4:02:36
4 andy 12/21/2016 8:57 12/21/2016 12:15 home 0 2:59:40
Based on user and overlap, I would like to do a transform on 'Time'. I've done this:
groups = sample.groupby('user')['Time']
flag = sample.groupby('user')['overlap'].transform('max')
sample.loc[:,'time_new'] = np.select([flag.eq(0), flag.isin([1,2])], [groups.transform('sum'), groups.transform('max')])
But I get the following error:
TypeError: Cannot cast scalar from dtype('<m8[ns]') to dtype('<m8') according to the rule 'same_kind'
How can I do the transform properly?
Upvotes: 1
Views: 62
Reputation: 59579
Convert the timedelta to a float
that counts the number of seconds, and do the math. Then if you want, convert back to a timedelta
groups = sample['Time'].dt.total_seconds().groupby(sample['user'])
flag = sample.groupby('user')['overlap'].transform('max')
sample.loc[:,'time_new'] = np.select([flag.eq(0), flag.isin([1,2])],
[groups.transform('sum'), groups.transform('max')])
sample['time_new'] = pd.to_timedelta(sample['time_new'], unit='s')
user in out location overlap Time time_new
0 ron 12/21/2021 10:11 12/21/2016 17:50 home 0 04:19:03 04:19:03
1 ron 12/21/2016 13:26 12/21/2016 13:52 office 2 00:25:28 04:19:03
2 april 12/21/2016 8:12 12/21/2016 17:27 office 0 08:15:03 12:17:39
3 april 12/21/2016 18:54 12/21/2016 22:56 office 0 04:02:36 12:17:39
4 andy 12/21/2016 8:57 12/21/2016 12:15 home 0 02:59:40 02:59:40
Upvotes: 1