Reputation: 460
In [167]:
df
Out[167]:
Gender University
0 Male A
1 Female B
2 Male C
3 Male D
4 Male E
5 Female A
6 Female B
7 Female C
8 Female D
9 Female E
In [168]:
df.groupby(['University','Gender'])['Gender'].size().unstack('Gender').fillna(0)
Out[168]:
Now I would like to sort by Female and Male from highest to lowest, so that the bars appear in descending order when I draw a bar plot. I have tried several approaches, but none of them worked.
In my last attempt I tried:
df.groupby(['University','Gender'])['Gender'].size().unstack('Gender').fillna(0).sort_values(ascending=False)
TypeError: sort_values() missing 1 required positional argument: 'by'
Any suggestions?
Upvotes: 1
Views: 1766
Reputation: 862611
On a DataFrame, sort_values needs the by argument, so you have to name the column to sort on. You can sort by either column:
print (df)
Gender University
0 Male A
1 Female B
3 Male D
4 Male E
5 Female A
2 Male C
3 Male D
4 Male E
5 Female A
6 Female B
7 Female C
8 Female D
4 Male E
5 Female A
6 Female B
3 Male D
4 Male E
5 Female A
7 Female C
8 Female D
9 Female E
# parentheses let the method chain span several lines
df1 = (df.groupby(['University','Gender'])['Gender']
         .size()
         .unstack('Gender', fill_value=0)
         .sort_values(by='Female', ascending=False))
print (df1)
Gender Female Male
University
A 4 1
B 3 0
C 2 1
D 2 3
E 1 4
df1.plot.bar()
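If you want to run this without retyping the data, here is a minimal self-contained sketch. It rebuilds a frame with the same counts as the table above (the row order is my assumption, and names such as counts, table and by_female are only illustrative), and it also shows why the original call failed: a DataFrame needs by, while a single column (a Series) does not.

```python
import pandas as pd

# (University, Gender) -> number of rows, taken from the printed count table.
# The order of the generated rows is an assumption; it does not affect the counts.
counts = {
    ("A", "Female"): 4, ("A", "Male"): 1,
    ("B", "Female"): 3,
    ("C", "Female"): 2, ("C", "Male"): 1,
    ("D", "Female"): 2, ("D", "Male"): 3,
    ("E", "Female"): 1, ("E", "Male"): 4,
}
rows = [
    {"University": uni, "Gender": gender}
    for (uni, gender), n in counts.items()
    for _ in range(n)
]
df = pd.DataFrame(rows)

# Count rows per (University, Gender) and pivot Gender into columns;
# fill_value=0 covers missing combinations such as (B, Male).
table = (
    df.groupby(["University", "Gender"])
      .size()
      .unstack("Gender", fill_value=0)
)

# A DataFrame has several columns, so sort_values requires `by`.
by_female = table.sort_values(by="Female", ascending=False)
by_male = table.sort_values(by="Male", ascending=False)

# A single column is a Series, so there is no `by` to pass.
female_only = table["Female"].sort_values(ascending=False)

print(by_female)
print(by_male)
```

Calling by_female.plot.bar() or by_male.plot.bar() afterwards should draw the bars in the sorted row order.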
# same chain, sorted by the Male column instead
df2 = (df.groupby(['University','Gender'])['Gender']
         .size()
         .unstack('Gender', fill_value=0)
         .sort_values(by='Male', ascending=False))
print (df2)
Gender Female Male
University
E 1 4
D 2 3
A 4 1
C 2 1
B 3 0
df2.plot.bar()
If you sort by both columns, the second column only decides the order of rows that tie on the first (here rows D and C, which both have a Female count of 2):
# Female is the primary key, Male only breaks ties
df3 = (df.groupby(['University','Gender'])['Gender']
         .size()
         .unstack('Gender', fill_value=0)
         .sort_values(by=['Female', 'Male'], ascending=False))
print (df3)
df3.plot.bar()
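To make the tie-breaking visible, here is a small sketch that types the count table in directly instead of building it with groupby (the variable names are only illustrative). Passing a list to ascending is an optional variation, in case you want a different direction per column.

```python
import pandas as pd

# The count table printed earlier, entered by hand.
table = pd.DataFrame(
    {"Female": [4, 3, 2, 2, 1], "Male": [1, 0, 1, 3, 4]},
    index=pd.Index(list("ABCDE"), name="University"),
)
table.columns.name = "Gender"

# Female is the primary key; Male only reorders rows with equal Female counts.
both = table.sort_values(by=["Female", "Male"], ascending=False)
print(both)  # University order: A, B, D, C, E

# One direction per column: Female descending, ties broken by Male ascending.
mixed = table.sort_values(by=["Female", "Male"], ascending=[False, True])
print(mixed)  # University order: A, B, C, D, E
```

Only D and C change places between the two results, because they are the only rows with the same Female count.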
Upvotes: 1