Ramiro

Reputation: 43

Python Plotnine - Create a stacked bar chart

I've been trying to draw a stacked bar chart using plotnine. The chart represents the end-of-month inventory within a single "Category". The "SubCategory" values are what should be stacked.

I've built a pandas dataframe from a query to a database. The query retrieves the sum(inventory) for each "subcategory" within a "category" in a date range.

This is the format of the DataFrame:

     SubCategory1    SubCategory2    SubCategory3  ....   Dates
0      1450.0            130.5            430.2    ....  2019/Jan 
1      1233.2           1000.0             13.6    ....  2019/Feb
2      1150.8            567.2            200.3    ....  2019/Mar

Dates should be on the X axis, and the height of each bar on the Y axis should be the sum "SubCategory1" + "SubCategory2" + "SubCategory3", with each subcategory distinguishable by color.

I tried this because I thought it made sense but had no luck:

g = ggplot(df)    
for key in subcategories: 
    g = g + geom_bar(aes(x='Dates', y=key), stat='identity', position='stack')  

Here, subcategories is a dictionary containing the SubCategory names.

Maybe the format of the DataFrame is not ideal, or I don't know how to use it properly with plotnine/ggplot.

Thanks for the help.

Upvotes: 4

Views: 4651

Answers (2)

has2k1

Reputation: 2375

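You need the data in tidy (long) format: one row per date and subcategory, so that the subcategory can be mapped to the fill aesthetic.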

from io import StringIO
import pandas as pd
from plotnine import *
from mizani.breaks import date_breaks

io = StringIO("""
SubCategory1    SubCategory2    SubCategory3     Dates
1450.0            130.5            430.2      2019/Jan 
1233.2           1000.0             13.6      2019/Feb
1150.8            567.2            200.3      2019/Mar
""")

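# Split on runs of whitespace (raw string avoids an invalid-escape warning);
# column 3 (Dates) is parsed as datetimes
data = pd.read_csv(io, sep=r'\s+', parse_dates=[3])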

# Make the data tidy
df = pd.melt(data, id_vars=['Dates'], var_name='categories')

"""
       Dates    categories   value
0 2019-01-01  SubCategory1  1450.0
1 2019-02-01  SubCategory1  1233.2
2 2019-03-01  SubCategory1  1150.8
3 2019-01-01  SubCategory2   130.5
4 2019-02-01  SubCategory2  1000.0
5 2019-03-01  SubCategory2   567.2
6 2019-01-01  SubCategory3   430.2
7 2019-02-01  SubCategory3    13.6
8 2019-03-01  SubCategory3   200.3
"""

(ggplot(df, aes('Dates', 'value', fill='categories'))
 + geom_col()
 + scale_x_datetime(breaks=date_breaks('1 month'))
)
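
If the frame comes straight from a database query, as in the question, the same reshaping can be sketched without going through read_csv. This is only an illustration: the column names SubCategory and inventory are my own choice, and keeping Dates as an ordered categorical (instead of parsing it to datetimes) is an assumption that suits month labels stored as strings.

```python
import pandas as pd

# Wide frame shaped like the one in the question (values copied from it).
wide = pd.DataFrame({
    "SubCategory1": [1450.0, 1233.2, 1150.8],
    "SubCategory2": [130.5, 1000.0, 567.2],
    "SubCategory3": [430.2, 13.6, 200.3],
    "Dates": ["2019/Jan", "2019/Feb", "2019/Mar"],
})

# Keep the month labels in their original order; plain strings would
# otherwise be sorted alphabetically (Feb, Jan, Mar) on a discrete axis.
wide["Dates"] = pd.Categorical(
    wide["Dates"], categories=wide["Dates"].tolist(), ordered=True
)

# Wide -> long: one row per (date, subcategory) pair.
# "SubCategory" and "inventory" are illustrative names, not from the question.
tidy = wide.melt(id_vars="Dates", var_name="SubCategory", value_name="inventory")

# Sanity check: the per-date totals are the heights of the stacked bars.
totals = tidy.groupby("Dates", observed=True)["inventory"].sum()
print(totals)
```

The tidy frame can then be passed to ggplot with aes('Dates', 'inventory', fill='SubCategory') and geom_col(), as above. With a categorical x axis the scale_x_datetime layer should be left out.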

Result plot (image not included).

Upvotes: 3

Quang Hoang

Reputation: 150785

Do you really need plotnine? pandas alone can do it, using the original wide DataFrame:

df.plot.bar(x='Dates', stacked=True)

Output: a stacked bar chart (image not included).

Upvotes: 1
