DimKoim

Reputation: 1044

Ascending order of bars in seaborn barplot

I have the following dataframe:

   Class    Age Percentage
0   2004    3   43.491170
1   2004    2   29.616607
2   2004    4   13.838925
3   2004    6   10.049712
4   2004    5   2.637445
5   2004    1   0.366142
6   2005    2   51.267369
7   2005    3   19.589268
8   2005    6   13.730432
9   2005    4   11.155305
10  2005    5   3.343524
11  2005    1   0.913590
12  2005    9   0.000511

I would like to make a bar plot using seaborn with 'Percentage' on the y-axis and 'Class' on the x-axis, with the bars labelled by the 'Age' column. I would also like the bars arranged in descending order, i.e. from the tallest to the shortest bar.

To do that, I thought of the following: set the hue_order parameter based on the order of the 'Percentage' variable. For example, sorting the 'Percentage' column in descending order for Class == 2004 gives hue_order = [3, 2, 4, 6, 5, 1].

Here is my code:

import matplotlib.pyplot as plt
import seaborn as sns

def hue_order():
    for cls in dataset.Class.unique():
        temp_df = dataset[dataset['Class'] == cls]
        order = temp_df.sort_values('Percentage', ascending = False)['Age']  
    return order

sns.barplot(x="Class", y="Percentage", hue = 'Age', 
                 hue_order= hue_order(),  
                 data=dataset)
plt.show()

However, the bars are in descending order only for the Class == 2005. Any help?

In my question, I am using the hue parameter, thus, it is not a duplicate as proposed.


Upvotes: 2

Views: 8233

Answers (1)

ImportanceOfBeingErnest

Reputation: 339240

The seaborn hue parameter adds another dimension to the plot, and hue_order determines the order in which the levels of that dimension are drawn. That order is the same for every group on the x-axis and cannot be split. You can change it so that, say, Age == 2 comes third everywhere in the plot, but you cannot have it first in one Class and third in another.
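This is also why only Class == 2005 ends up sorted in the question: the loop in hue_order() overwrites order on every pass, so the function returns the order of the last class only. A minimal sketch using the question's data (per_class is a name introduced here just for illustration):

```python
import pandas as pd

dataset = pd.DataFrame({
    "Class": [2004] * 6 + [2005] * 7,
    "Age": [3, 2, 4, 6, 5, 1, 2, 3, 6, 4, 5, 1, 9],
    "Percentage": [43.491170, 29.616607, 13.838925, 10.049712, 2.637445,
                   0.366142, 51.267369, 19.589268, 13.730432, 11.155305,
                   3.343524, 0.913590, 0.000511],
})

def hue_order():
    for cls in dataset.Class.unique():
        temp_df = dataset[dataset["Class"] == cls]
        # 'order' is overwritten on every pass through the loop
        order = temp_df.sort_values("Percentage", ascending=False)["Age"]
    return order  # only the order of the last class (2005) is returned

print(hue_order().tolist())  # [2, 3, 6, 4, 5, 1, 9]

# The descending order differs per class, so no single hue_order fits both.
per_class = {
    cls: grp.sort_values("Percentage", ascending=False)["Age"].tolist()
    for cls, grp in dataset.groupby("Class")
}
print(per_class[2004])  # [3, 2, 4, 6, 5, 1]
```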

To achieve what is desired here, namely a different order of the auxiliary dimension for each group within the same axes, you need to handle this manually.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

df = pd.DataFrame({"Class" : [2004]*6+[2005]*7,
                   "Age" : [3,2,4,6,5,1,2,3,6,4,5,1,9],
                   "Percentage" : [50,40,30,20,10,30,20,35,40,50,45,30,15]})

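def sortedgroupedbar(ax, x, y, groupby, data=None, width=0.8, **kwargs):
    # Draw grouped bars in which every x group is sorted by y, tallest first.
    order = np.zeros(len(data))
    df = data.copy()
    for xi in np.unique(df[x].values):
        mask = (df[x] == xi).values
        a = df.loc[mask, y].values
        # b: row positions within this group, sorted by descending y
        b = sorted(np.arange(len(a)), key=lambda i: a[i], reverse=True)
        # c: inverse of b, i.e. the slot (0 = leftmost) of each row in the group
        c = sorted(np.arange(len(a)), key=lambda i: b[i])
        order[mask] = c
    df["order"] = order
    # u: unique x values; "ind": integer position of each row's x group
    u, df["ind"] = np.unique(df[x].values, return_inverse=True)
    # all bars share one width, based on the total number of groupby values
    step = width / len(np.unique(df[groupby].values))
    # one ax.bar call per groupby value, so each gets one color and legend entry
    for xi, grp in df.groupby(groupby):
        ax.bar(grp["ind"] - width/2. + grp["order"]*step + step/2.,
               grp[y], width=step, label=xi, **kwargs)
    ax.legend(title=groupby)
    ax.set_xticks(np.arange(len(u)))
    ax.set_xticklabels(u)
    ax.set_xlabel(x)
    ax.set_ylabel(y)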


fig, ax = plt.subplots()    
sortedgroupedbar(ax, x="Class",y="Percentage", groupby="Age", data=df)
plt.show()
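As a side note, the per-group slot computed in the loop above could probably also be obtained with pandas' groupby rank. This is only a sketch of an alternative, not what the function above does internally; it assumes the same df as in the example.

```python
import pandas as pd

df = pd.DataFrame({"Class": [2004] * 6 + [2005] * 7,
                   "Age": [3, 2, 4, 6, 5, 1, 2, 3, 6, 4, 5, 1, 9],
                   "Percentage": [50, 40, 30, 20, 10, 30, 20, 35, 40, 50, 45, 30, 15]})

# Slot of each bar inside its Class group: 0 for the tallest, 1 for the next, ...
# method="first" gives tied values distinct ranks, so no two bars share a slot.
df["order"] = (df.groupby("Class")["Percentage"]
                 .rank(ascending=False, method="first")
                 .astype(int) - 1)
print(df)
```

The resulting "order" column plays the same role as df["order"] in sortedgroupedbar, so the rest of the function would stay unchanged.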


Upvotes: 7
