Reputation: 10051
I have a dataframe as follows:
index col1 col2 col3 col4 col5
0 0 Week_1 James John 1 when and why?
1 1 Week_1 James John 3 when and why? How?
2 2 Week_2 James John 2 How far is it? Are you going?
3 3 Week_2 Mark Jim 3 Do you know when?
4 4 Week_2 Andrew Simon 3 What time?
5 5 Week_2 Andrew Simon 6 What time?
How could I groupby col2
and col3
then calculate mean and count numbers of col2
and col3
?
df.groupby(['col2','col3'], as_index=False).agg({'col4':'mean'}).reset_index()
Output:
index col2 col3 col4
0 0 Andrew Simon 4.5
1 1 James John 2.0
2 2 Mark Jim 3.0
df.groupby(['col2','col3']).size().reset_index()
Output:
col2 col3 0
0 Andrew Simon 2
1 James John 3
2 Mark Jim 1
How could I get result like this? Thanks.
index col2 col3 mean count
0 0 James John 2.0 3
1 3 Mark Jim 3.0 1
2 4 Andrew Simon 4.5 2
Upvotes: 0
Views: 667
Reputation: 19957
You can use groupby and agg(may need to use pandas 0.25+).
(
df.groupby(['col2','col3'])
.agg(index=('index', 'first'),
mean=('col4', 'mean'),
count=('col4', 'size'))
.reset_index()
.sort_values(by='index')
)
col2 col3 index mean count
1 James John 0 2.0 3
2 Mark Jim 3 3.0 1
0 Andrew Simon 4 4.5 2
Upvotes: 3