aim

Reputation: 311

Group by and compute the average duration

I have a DataFrame df:

month   name    duration
5       abc     09:03:00
5       abc     09:09:00
5       eef     10:03:00
5       eef     09:03:00
5       eef     09:03:00
5       ttk     10:03:00
5       abc     09:03:00
5       ttk     09:23:00
6       ttk     09:03:00
6       abc     11:03:00

I need to group this by month and name to get a result like this:

month   name    name_size   name_nunique duration_mean
5       abc     3           1           09:05:00
        eef     3           1           09:23:00
        ttk     2           1           09:43:00
6       abc     1           1           11:03:00
        ttk     1           1           09:03:00

Please let me know how to do this. I tried doing

df.groupby(['month','name'], sort=False).agg({'name':['size','nunique'],
          'duration':['mean']})

But I get an error saying:

No numeric types to aggregate

Upvotes: 1

Views: 52

Answers (1)

DYZ

Reputation: 57033

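The built-in groupby "mean" aggregation cannot handle this column: as strings it is not numeric, and at least in older pandas versions a timedelta column is rejected with the same error. Convert the column to timedeltas, then apply a lambda that calls Series.mean() on each group: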

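import pandas as pd
# Parse the "HH:MM:SS" strings into timedeltas so they can be averaged.
df["duration"] = pd.to_timedelta(df["duration"])
df.groupby(["month", "name"], sort=False).agg({
    "name": ["size", "nunique"],
    "duration": [lambda x: x.mean()],  # Series.mean() handles timedeltas
})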
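
For reference, here is a self-contained sketch of the same idea that rebuilds the sample data from the question. It assumes pandas 0.25 or newer, because it uses named aggregation to produce flat column names; the names name_size, name_nunique and duration_mean are taken from the desired output in the question, and the variable result is just an illustrative name.

```python
import pandas as pd

# Sample data copied from the question; durations start out as strings.
df = pd.DataFrame({
    "month": [5, 5, 5, 5, 5, 5, 5, 5, 6, 6],
    "name": ["abc", "abc", "eef", "eef", "eef", "ttk", "abc", "ttk", "ttk", "abc"],
    "duration": ["09:03:00", "09:09:00", "10:03:00", "09:03:00", "09:03:00",
                 "10:03:00", "09:03:00", "09:23:00", "09:03:00", "11:03:00"],
})

# Convert "HH:MM:SS" strings to timedelta64 so a mean is well defined.
df["duration"] = pd.to_timedelta(df["duration"])

# Named aggregation: output_column=(input_column, function).
result = df.groupby(["month", "name"], sort=False).agg(
    name_size=("name", "size"),
    name_nunique=("name", "nunique"),
    duration_mean=("duration", lambda x: x.mean()),
)
print(result)
```

With this data, duration_mean for month 5 comes out as 09:05:00 for abc, 09:23:00 for eef and 09:43:00 for ttk. Because sort=False is kept, the groups appear in order of first occurrence rather than sorted by key.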

Upvotes: 1
