Reputation: 1521
I have the following dataframe and I am trying to create a stacked bar plot:
import os
from pprint import pprint
import matplotlib.pyplot as plt
import pandas as pd

def classify_data():
    race = ['race1','race1','race1','race1','race2','race2','race2', 'race2']
    qualifier = ['last','first','first','first','last','last','first','first']
    participant = ['rat','rat','cat','cat','rat','dog','dog','dog']
    df = pd.DataFrame(
        {'race':race,
         'qualifier':qualifier,
         'participant':participant
        }
    )
    pprint(df)
    df2 = df.groupby(['race','qualifier'])['race'].count().unstack('qualifier').fillna(0)
    df2[['first','last']].plot(kind='bar', stacked=True)
    plt.show()

classify_data()
I managed to obtain the following plot. However, I want to create two plots from my dataframe.
One plot containing the following data for the qualifier 'last':
Race1 rat 1
Race1 cat 0
Race1 dog 0
Race2 rat 1
Race2 dog 1
Race2 cat 0
So the first bar plot would have two bars (one per race), with each bar stacked and color-coded by the count of each participant.
Likewise, a second plot for the qualifier 'first'.
EDIT:
Race1 rat 1
Race1 cat 2
Race1 dog 0
Race2 rat 0
Race2 dog 2
Race2 cat 0
From the original dataframe, I have to create the two dataframes above in order to draw the stacked plots.
I am not sure how to use the groupby function to get the count of 'participant' for each 'qualifier' within a given 'race'.
EDIT 2: For qualifier 'last', the desired plot would look like this (blue for rat, red for dog).
For qualifier 'first':
Could someone suggest how to proceed from here?
Upvotes: 3
Views: 5564
Reputation: 150815
IIUC, this is what you want:
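# Count rows for every (race, qualifier, participant) combination,
# then pivot the participants into columns.
df2 = (df.groupby(['race','qualifier','participant'])
         .size()
         .unstack(level=-1)
         .reset_index()
      )

# One subplot per qualifier, sharing the y-axis.
fig, axes = plt.subplots(1, 2, figsize=(12, 6), sharey=True)
for ax, q in zip(axes.ravel(), ['first','last']):
    tmp_df = df2[df2.qualifier.eq(q)]
    tmp_df.plot.bar(x='race', ax=ax, stacked=True)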
Output:
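To check the counts behind each subplot against the two tables in the question, it may help to build them explicitly. The sketch below is not part of the code above; it assumes the same sample data, and the names `counts` and `tables` are made up for illustration. It passes `fill_value=0` to `unstack` so that combinations that never occur show up as 0 rather than NaN.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'race': ['race1', 'race1', 'race1', 'race1', 'race2', 'race2', 'race2', 'race2'],
    'qualifier': ['last', 'first', 'first', 'first', 'last', 'last', 'first', 'first'],
    'participant': ['rat', 'rat', 'cat', 'cat', 'rat', 'dog', 'dog', 'dog'],
})

# Count rows per (race, qualifier, participant) and pivot participants
# into columns; missing combinations are filled with 0.
counts = (df.groupby(['race', 'qualifier', 'participant'])
            .size()
            .unstack('participant', fill_value=0))

# Hypothetical helper dict: one table per qualifier, indexed by race,
# with one column per participant.
tables = {q: counts.xs(q, level='qualifier') for q in ['first', 'last']}

print(tables['last'])
print(tables['first'])
```

Each table could then be drawn with something like `tables[q].plot.bar(stacked=True, ax=ax)`. Passing a list such as `color=['green', 'red', 'blue']` (in cat, dog, rat column order) should give blue for rat and red for dog as requested; green for cat is an arbitrary choice here.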
Upvotes: 2