Reputation: 4398
I would like to group by multiple columns and perform several different aggregations. Grouping by type and date and taking average of en, en2, stat1 and stat2.
Data
type en en2 date stat1 stat2
aa 40 80 1/1/2021 1 1
aa 20 20 1/1/2021 2 1
aa 10 10 1/1/2021 3 5
bb 10 10 1/1/2021 3 9
bb 50 5 1/1/2021 5 1
aa 90 5 1/7/2021 5 2
aa 100 10 1/7/2021 1 5
bb 80 10 1/7/2021 5 2
Desired
type en en2 date stat1 stat2
aa 23 36 1/1/2021 2 3
bb 30 7.5 1/1/2021 4 5
aa 95 7.5 1/7/2021 3 3.5
bb 80 10 1/7/2021 5 2
Doing
grouped = final.groupby(['date'],['type']) \
.agg({'en':'mean', 'en2':'mean','stat1':'mean','stat2':'mean'})
I am getting a typeError. - Unhashable list I am researching. Any suggestion is appreciated.
Upvotes: 1
Views: 55
Reputation: 120429
Try:
grouped = final.groupby(['date', 'type'], as_index=False) \
.agg({'type': 'first', 'en': 'mean', 'en2': 'mean',
'date': 'first', 'stat1': 'mean', 'stat2': 'mean'})
print(grouped)
# Output
type en en2 date stat1 stat2
0 aa 23.333333 36.666667 1/1/2021 2.0 2.333333
1 bb 30.000000 7.500000 1/1/2021 4.0 5.000000
2 aa 95.000000 7.500000 1/7/2021 3.0 3.500000
3 bb 80.000000 10.000000 1/7/2021 5.0 2.000000
Upvotes: 1
Reputation: 347
grouped = final[['date', 'type', 'en',
'en2','stat1','stat2']].groupby(['date', 'type'],
as_index=False, dropna=False).sum()
Upvotes: 1