Reputation: 31
I have big DataFrame (df) which looks like:
Acc_num date_diff 0 29 0:04:43 1 29 0:01:43 2 29 2:22:45 3 29 0:16:21 4 29 0:58:20 5 30 0:00:35 6 34 7:15:26 7 34 4:40:01 8 34 0:56:02 9 34 6:53:44 10 34 1:36:58 ..... Acc_num int64 date_diff timedelta64[ns] dtype: object
I need to calculate 'date_diff' mean (in timedelta format) for each account number.
df.date_diff.mean()
works correctly. But when I try next:
df.groupby('Acc_num').date_diff.mean()
it raises an exception:
"DataError: No numeric types to aggregate"
I also tried df.pivot_table()
method, but didn't acheive anything.
Could someone help me with this stuff. Thank you in advance!
Upvotes: 3
Views: 3288
Reputation: 1690
Weird limitation indeed. But a simple solution would be:
df.groupby('Acc_num').date_diff.agg(lambda g:g.sum()/g.count())
Edit:
Pandas will actually attempt to aggregate non-numeric columns if you pass numeric_only=False
df.groupby('Acc_num').date_diff.mean(numeric_only=False)
Upvotes: 7