Fasteno

Reputation: 315

Stacked barplot in seaborn using numeric data as hue

I have a simple pandas DataFrame with three columns (Month, Amount, Category), where each row represents an expense in a certain category:

import pandas as pd

d = {'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar'], 'Amount': [5, 65, 29, 200, 28.5, 12, 4, 100, 21], 'Category': ['Travel', 'Food', 'Dentist', 'Dentist', 'Food', 'Travel', 'Food', 'Sport', 'Sport']}
df = pd.DataFrame(d)

I'd like to create a seaborn bar plot where each bar represents the total expenses for a month, and each bar is split into colored segments, one per category, each showing that category's total expense for the month.

I was able to achieve the result with a pretty convoluted method, plotting with matplotlib:

df = df.groupby(['Month', 'Category']).sum()   
df.reset_index(inplace=True)
pivot_df = df.pivot(index='Month', columns='Category', values='Amount')
pivot_df.plot.bar(stacked=True, colormap='tab20')

But this method gives an error when I try to use seaborn, and it seems unnecessarily complicated.

Is there a better way to achieve the desired result?

Upvotes: 1

Views: 3929

Answers (1)

ALollz

Reputation: 59519

Your initial method is complicated because it has unnecessary steps. You group and then pivot, but the same aggregation and reshaping can be done in one step with pivot_table. Starting from your initial DataFrame:

df_pivot = pd.pivot_table(df, index='Month', columns='Category', values='Amount', aggfunc='sum')
df_pivot.plot.bar(stacked=True, colormap='tab20')

As for using seaborn, I wouldn't. It doesn't really support stacked bar plots: all of its examples that look stacked have only two categories, where the total is plotted first and one group is then overlaid on top, giving the impression of stacking. That approach doesn't extend easily to more than two groups.
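
To show what that overlay trick looks like with more than two groups, here is a rough sketch. This is my own generalization, not an official seaborn recipe: it computes running totals per category and draws the tallest layer first so each later layer paints over the bottom of the previous one. The names totals, cumulative and plot_data are just illustrative.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

d = {'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar'],
     'Amount': [5, 65, 29, 200, 28.5, 12, 4, 100, 21],
     'Category': ['Travel', 'Food', 'Dentist', 'Dentist', 'Food', 'Travel', 'Food', 'Sport', 'Sport']}
df = pd.DataFrame(d)

# Sum per month and category; 0 where a month has no expense in a category.
totals = df.pivot_table(index='Month', columns='Category', values='Amount',
                        aggfunc='sum', fill_value=0)
totals = totals.reindex(['Jan', 'Feb', 'Mar'])  # assumption: calendar order wanted

# Running total across categories: each column is the top edge of that layer.
cumulative = totals.cumsum(axis=1)
plot_data = cumulative.reset_index()

fig, ax = plt.subplots()
palette = sns.color_palette(n_colors=len(cumulative.columns))

# Draw from the tallest layer down, so each shorter bar covers the lower
# part of the one drawn before it and only the "slice" on top stays visible.
for category, color in reversed(list(zip(cumulative.columns, palette))):
    sns.barplot(data=plot_data, x='Month', y=category,
                color=color, label=category, ax=ax)

ax.set_ylabel('Amount')
ax.legend()
```

You have to manage the cumulative sums and the draw order yourself, which is exactly the bookkeeping that stacked=True in pandas does for you.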

But if you want that seaborn feel, you can apply its default styling and keep plotting with pandas:

import seaborn as sns
sns.set()

df_pivot = pd.pivot_table(df, index='Month', columns='Category', values='Amount', aggfunc='sum')
df_pivot.plot.bar(stacked=True)

Upvotes: 5
