Pandas - stacked bar chart with column values for stacking

Question

I have a data set with three sets of data: class type, neighborhood, and visibility.

I'm trying to create a bar chart that is both stacked and unstacked -- stacked by visibility, lined up by neighborhood. So basically, I'm looking for a combination of the unstacked-ness of this chart:

nbvis_gb = nbvis.sort_values(by=['visibility'],ascending=False).groupby(by='visibility',sort=False)
fig, ax = plt.subplots(nrows=1,ncols=2,figsize=(14,8),sharey=True)

for (i, j), ax,color in zip(nbvis_gb,ax.flatten(),colors_hood):
    print(j['class'].values)
    title = str(i)
    j.plot.bar(ax=ax,colors=colors_hood)
    ax.set_title(title, fontsize=20)
    #ax.set_ylim(0,1.05)
    ax.tick_params(labelsize=16)
    ax.set_xticklabels(j['class'].values)
    ax.legend_.remove()


ax.legend(loc=8,fontsize=20,ncol=4,bbox_to_anchor=(0,-.45))
fig.tight_layout(h_pad=2)
fig.suptitle('Visibility of containers by class and neighborhood',y=1.03,fontsize=24)

and the stacked-ness of this chart:

nbvis.unstack()['Neighborhood 1'].plot.bar(stacked=True)

Any help would be greatly appreciated!

Cheers, Elizabeth

Parfait · Accepted Answer

Consider using melt and pivot_table on your dataframe to build a multi-index dataframe aligned to your graph dimensions: neighborhood and class on the index, visibility on the columns. The code below displays the graph on screen and saves the figure as a PNG image in the working directory, using seaborn's color scheme. Adjust the graph settings as needed.
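To make the reshaping step concrete, here is a minimal sketch on a toy frame. The class names, visibility labels and values are made up for illustration, and the names toy, long_df and wide_df are hypothetical. It only shows the shape that melt followed by pivot_table produces.

```python
import pandas as pd

# Toy data (hypothetical values): one row per (class, visibility),
# one column per neighborhood.
toy = pd.DataFrame({
    "class": ["bucket", "bucket", "tarp", "tarp"],
    "visibility": ["visible", "not visible", "visible", "not visible"],
    "Neighborhood1": [1.0, 2.0, 3.0, 4.0],
    "Neighborhood2": [5.0, 6.0, 7.0, 8.0],
})

# Long format: the neighborhood columns become rows.
# melt names the new value column "value" by default.
long_df = pd.melt(toy, id_vars=["class", "visibility"], var_name="Neighborhood")

# Wide format: (Neighborhood, class) on the index, one column per visibility level.
wide_df = long_df.pivot_table(index=["Neighborhood", "class"],
                              columns="visibility", values="value")
print(wide_df)

# Selecting one neighborhood leaves a class-by-visibility frame,
# which is the layout that plot(kind="bar", stacked=True) stacks column by column.
print(wide_df.xs("Neighborhood1"))
```

Each visibility column becomes one layer of the stack, and each class becomes one bar.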

Data

import numpy as np
import pandas as pd
from itertools import product

from matplotlib import pyplot as plt
import seaborn

np.random.seed(444)
df = pd.DataFrame(list(product(['bucket (1)', 'flower pot (2)', 'tarp (3)', 'trash (6)', 'toy (7)',
                                'piping/tubing (9)', 'other (10)'],
                               ['visible containers', 'partial or not visible containers'])), 
                  columns=['class', 'visibility']).assign(Neighborhood1 = abs(np.random.randn(14)),
                                                          Neighborhood2 = abs(np.random.randn(14)),
                                                          Neighborhood3 = abs(np.random.randn(14)),
                                                          Neighborhood4 = abs(np.random.randn(14)))

Graphing

seaborn.set()

def runplot(pvtdf):
    # Take the neighborhoods from the pivoted frame's index rather than a global
    neighborhoods = pvtdf.index.get_level_values('Neighborhood').unique()
    fig, axes = plt.subplots(nrows=1, ncols=len(neighborhoods), figsize=(20, 8))

    # One stacked bar chart per neighborhood, side by side
    for i, n in enumerate(neighborhoods):
        pvtdf.xs(n).plot(ax=axes[i], kind='bar', stacked=True, edgecolor='w',
                         width=0.5, fontsize=12,
                         title='{} - Visibility of containers \n by class and neighborhood'.format(n))
        axes[i].title.set_size(16)

    plt.tight_layout()
    fig.savefig('Output.png')
    plt.show()
    plt.clf()

# MELT LONG
mdf = pd.melt(df, id_vars = ['class', 'visibility'], var_name='Neighborhood')

# PIVOT WIDE
pvtdf = mdf.pivot_table(index= ['Neighborhood', 'class'], columns='visibility', values='value')

runplot(pvtdf)

plt.close()
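If the panels should also share a y-axis so that neighborhoods can be compared directly (the question's own attempt used sharey=True), a variation along these lines may work. This is a sketch, not part of the original answer: the helper name plot_by_neighborhood, the demo data and the Agg backend are assumptions made so the example runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_by_neighborhood(pvtdf):
    """Hypothetical helper: one stacked bar panel per neighborhood, shared y-axis."""
    hoods = pvtdf.index.get_level_values("Neighborhood").unique()
    # squeeze=False keeps axes two-dimensional even with a single neighborhood
    fig, axes = plt.subplots(nrows=1, ncols=len(hoods), figsize=(20, 8),
                             sharey=True, squeeze=False)
    for ax, hood in zip(axes[0], hoods):
        pvtdf.xs(hood).plot(ax=ax, kind="bar", stacked=True,
                            legend=False, title=str(hood))
    # One legend for the whole figure instead of one per panel
    handles, labels = axes[0][0].get_legend_handles_labels()
    fig.legend(handles, labels, loc="lower center", ncol=max(len(labels), 1))
    fig.tight_layout(rect=(0, 0.08, 1, 1))  # leave room for the legend
    return fig, axes[0]


# Demo data (made-up values) in the same pivoted layout as pvtdf above
demo = pd.DataFrame({
    "Neighborhood": ["N1", "N1", "N2", "N2"],
    "class": ["bucket", "tarp", "bucket", "tarp"],
    "visible": [1.0, 2.0, 3.0, 4.0],
    "not visible": [0.5, 1.5, 2.5, 3.5],
}).set_index(["Neighborhood", "class"])

fig, axes = plot_by_neighborhood(demo)
```

With the answer's data, the call would be plot_by_neighborhood(pvtdf), followed by fig.savefig(...) or plt.show() as needed.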

Output

[Figure: one panel per neighborhood, each a bar chart of container classes stacked by visibility]
