Reputation: 1044
I have the following dataframe:
    Class  Age  Percentage
0    2004    3   43.491170
1    2004    2   29.616607
2    2004    4   13.838925
3    2004    6   10.049712
4    2004    5    2.637445
5    2004    1    0.366142
6    2005    2   51.267369
7    2005    3   19.589268
8    2005    6   13.730432
9    2005    4   11.155305
10   2005    5    3.343524
11   2005    1    0.913590
12   2005    9    0.000511
I would like to make a bar plot with seaborn that has 'Percentage' on the y-axis and 'Class' on the x-axis, with the bars labelled by the 'Age' column. I would also like the bars within each class arranged in descending order, i.e. from the tallest bar to the shortest.
To do that, my idea was to set the hue_order parameter based on the order of the 'Percentage' column. For example, if I sort the 'Percentage' column in descending order for Class == 2004, then hue_order = [3, 2, 4, 6, 5, 1].
Here is my code:
import matplotlib.pyplot as plt
import seaborn as sns

def hue_order():
    for cls in dataset.Class.unique():
        temp_df = dataset[dataset['Class'] == cls]
        order = temp_df.sort_values('Percentage', ascending = False)['Age']
    return order

sns.barplot(x="Class", y="Percentage", hue = 'Age',
            hue_order= hue_order(),
            data=dataset)
plt.show()
However, the bars are in descending order only for Class == 2005. Any help?
In my question I am using the hue parameter, so it is not a duplicate as proposed.
Upvotes: 2
Views: 8233
Reputation: 339240
The seaborn hue parameter adds another dimension to the plot, and hue_order determines the order in which that dimension is drawn. However, you cannot split that order: the same order applies to every group on the x-axis. You may well change it so that Age == 2 comes third in the plot, but you cannot change it partially, so that it comes first in one group and third in another.
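This also explains why only 2005 comes out sorted in the question: the loop there overwrites order on every pass, so a single list (the one for the last class) is what reaches hue_order. A minimal sketch of that, assuming the question's data retyped into a dataset frame with rounded percentages; the helper name last_class_order is made up for illustration:

```python
import pandas as pd

# The question's data, retyped for illustration (percentages rounded).
dataset = pd.DataFrame({
    "Class": [2004] * 6 + [2005] * 7,
    "Age": [3, 2, 4, 6, 5, 1, 2, 3, 6, 4, 5, 1, 9],
    "Percentage": [43.49, 29.62, 13.84, 10.05, 2.64, 0.37,
                   51.27, 19.59, 13.73, 11.16, 3.34, 0.91, 0.0005],
})

def last_class_order(data):
    # Same logic as the question's hue_order(): `order` is overwritten
    # on each pass, so only the last class's ordering is returned.
    for cls in data["Class"].unique():
        temp_df = data[data["Class"] == cls]
        order = temp_df.sort_values("Percentage", ascending=False)["Age"]
    return order

order = last_class_order(dataset).tolist()
print(order)  # [2, 3, 6, 4, 5, 1, 9], i.e. the 2005 ordering only
```

Whatever list is passed, seaborn applies it to both classes, so at most one class can end up sorted this way.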
To achieve what is desired here, namely different orders of the auxiliary dimension within the same axes, you need to handle this manually.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

df = pd.DataFrame({"Class" : [2004]*6+[2005]*7,
                   "Age" : [3,2,4,6,5,1,2,3,6,4,5,1,9],
                   "Percentage" : [50,40,30,20,10,30,20,35,40,50,45,30,15]})

def sortedgroupedbar(ax, x, y, groupby, data=None, width=0.8, **kwargs):
    order = np.zeros(len(data))
    df = data.copy()
    # For each x group, rank the bars by height (0 = tallest).
    for xi in np.unique(df[x].values):
        group = data[df[x] == xi]
        a = group[y].values
        # b: row positions sorted by descending height; c: rank of each row
        b = sorted(np.arange(len(a)), key=lambda i: a[i], reverse=True)
        c = sorted(np.arange(len(a)), key=lambda i: b[i])
        order[data[x] == xi] = c
    df["order"] = order
    # Integer position of each x group along the axis
    u, df["ind"] = np.unique(df[x].values, return_inverse=True)
    # Width of one bar: the group width divided by the number of hue levels
    step = width/len(np.unique(df[groupby].values))
    # One bar call per hue level, so each level gets one colour and one legend entry
    for xi, grp in df.groupby(groupby):
        ax.bar(grp["ind"]-width/2.+grp["order"]*step+step/2.,
               grp[y], width=step, label=xi, **kwargs)
    ax.legend(title=groupby)
    ax.set_xticks(np.arange(len(u)))
    ax.set_xticklabels(u)
    ax.set_xlabel(x)
    ax.set_ylabel(y)

fig, ax = plt.subplots()
sortedgroupedbar(ax, x="Class", y="Percentage", groupby="Age", data=df)
plt.show()
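The per-group ranking loop above could probably also be written with pandas' groupby rank. This is only a sketch of that one step, not a drop-in replacement for the function; the xpos column name is made up here, and width=0.8 simply mirrors the default used above:

```python
import pandas as pd

df = pd.DataFrame({"Class": [2004] * 6 + [2005] * 7,
                   "Age": [3, 2, 4, 6, 5, 1, 2, 3, 6, 4, 5, 1, 9],
                   "Percentage": [50, 40, 30, 20, 10, 30, 20, 35, 40, 50, 45, 30, 15]})

# Rank each bar within its Class, 0 = tallest.
# method="first" gives tied values distinct ranks.
df["order"] = (df.groupby("Class")["Percentage"]
                 .rank(method="first", ascending=False)
                 .astype(int) - 1)

# Integer position of each Class along the x-axis.
df["ind"] = pd.factorize(df["Class"], sort=True)[0]

# Left-to-right bar centres, using the same formula as in the function above.
width = 0.8
step = width / df["Age"].nunique()
df["xpos"] = df["ind"] - width / 2 + df["order"] * step + step / 2

print(df[["Class", "Age", "Percentage", "order", "xpos"]])
```

Each Age can then be drawn with its own ax.bar call at df["xpos"], as in the loop above, so that the colours and the legend stay tied to Age while the positions follow the per-class ranking.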
Upvotes: 7