Reputation: 315
I have a simple pandas DataFrame with 3 columns (Month, Amount, Category), where each row represents an expense in a certain category:
import pandas as pd
d = {'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar'], 'Amount': [5, 65, 29, 200, 28.5, 12, 4, 100, 21], 'Category': ['Travel', 'Food', 'Dentist', 'Dentist', 'Food', 'Travel', 'Food', 'Sport', 'Sport']}
df = pd.DataFrame(d)
I'd like to create a seaborn bar plot where each bar represents the total expenses for a month and is split into colored segments, one per category, so that each hue shows that category's total for the month.
I was able to get the result with a fairly convoluted method, plotting with matplotlib through pandas:
df = df.groupby(['Month', 'Category']).sum()
df.reset_index(inplace=True)
pivot_df = df.pivot(index='Month', columns='Category', values='Amount')
pivot_df.plot.bar(stacked=True, colormap='tab20')
But this method gives an error when I try to use seaborn, and it seems unnecessarily complicated.
Is there a better way to achieve the desired result?
Upvotes: 1
Views: 3929
Reputation: 59519
Your initial method is complicated because it has unnecessary steps. You groupby and then pivot, but the same aggregation and reshaping can be done in one step with pivot_table. From your initial DataFrame:
df_pivot = pd.pivot_table(df, index='Month', columns='Category', values='Amount', aggfunc='sum')
df_pivot.plot.bar(stacked=True, colormap='tab20')
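As a rough, self-contained sketch of the point above, the snippet below builds the table both ways and checks whether the one-step pivot_table result matches the two-step groupby + pivot version. The reindex at the end is an addition not in the original answer: pivot_table sorts the index alphabetically (Feb, Jan, Mar), so the hypothetical month_order list is used to restore calendar order before plotting.

```python
import pandas as pd

d = {'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar'],
     'Amount': [5, 65, 29, 200, 28.5, 12, 4, 100, 21],
     'Category': ['Travel', 'Food', 'Dentist', 'Dentist', 'Food',
                  'Travel', 'Food', 'Sport', 'Sport']}
df = pd.DataFrame(d)

# One step: aggregate and reshape at once.
df_pivot = pd.pivot_table(df, index='Month', columns='Category',
                          values='Amount', aggfunc='sum')

# Two steps, as in the question: aggregate first, then reshape.
two_step = (df.groupby(['Month', 'Category'], as_index=False)['Amount'].sum()
              .pivot(index='Month', columns='Category', values='Amount'))

# Compare the two tables (NaNs in the same cells count as equal).
print(df_pivot.equals(two_step))

# Assumption: calendar order is wanted on the x-axis. month_order is a
# hypothetical helper list, not something from the original answer.
month_order = ['Jan', 'Feb', 'Mar']
df_pivot = df_pivot.reindex(month_order)
print(df_pivot)
```

Calling df_pivot.plot.bar(stacked=True, colormap='tab20') on the reindexed table should then draw the months in calendar order.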
As for using seaborn, I wouldn't. Its barplot doesn't really support stacking, and all of the seaborn examples that look like stacked plots only have two categories: they plot the total, then overlay one group on top, which gives the impression of a stack. That method doesn't easily extend to more than 2 groups.
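To illustrate why the overlay trick gets awkward with more than two groups, here is a minimal sketch of what it would take for this data. It is not a seaborn feature, just one possible workaround: compute a running total across categories and draw the tallest layer first, so each later call paints over the lower part of the bars. The names month_order, totals and cumulative are introduced here for illustration, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

d = {'Month': ['Jan', 'Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar', 'Mar', 'Mar'],
     'Amount': [5, 65, 29, 200, 28.5, 12, 4, 100, 21],
     'Category': ['Travel', 'Food', 'Dentist', 'Dentist', 'Food',
                  'Travel', 'Food', 'Sport', 'Sport']}
df = pd.DataFrame(d)
month_order = ['Jan', 'Feb', 'Mar']  # hypothetical helper for calendar order

# Per-month, per-category totals; missing combinations count as 0.
totals = (pd.pivot_table(df, index='Month', columns='Category',
                         values='Amount', aggfunc='sum')
            .reindex(month_order)
            .fillna(0))

# Running total across categories: each column is the height of the stack
# up to and including that category.
cumulative = totals.cumsum(axis=1)

fig, ax = plt.subplots()
palette = sns.color_palette('tab20', n_colors=len(cumulative.columns))

# Draw the tallest layer first; shorter layers drawn afterwards cover its
# lower part, leaving only each category's own slice visible.
for category, color in reversed(list(zip(cumulative.columns, palette))):
    sns.barplot(x=cumulative.index.to_list(),
                y=cumulative[category].to_numpy(),
                color=color, label=category, ax=ax)

ax.set_xlabel('Month')
ax.set_ylabel('Amount')
ax.legend(title='Category')
fig.savefig('stacked_overlay.png')
```

Every extra category means another full barplot call and manual bookkeeping of colors and legend entries, which is the reason the pandas stacked=True route is simpler.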
But if you want the seaborn look, you can apply its default theme with sns.set() and still plot with pandas:
import seaborn as sns
sns.set()
df_pivot = pd.pivot_table(df, index='Month', columns='Category', values='Amount', aggfunc='sum')
df_pivot.plot.bar(stacked=True)
Upvotes: 5