Reputation: 879
I'm trying to do a stacked bar graph. I can do a basic bar graph:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Y': [1,1,1,1,1,2,3,2],
                   'X': [2,2,2,2,3,3,3,4]})
# Rows belonging to class Y = 1 and class Y = 2
Y_1 = df.loc[df['Y'] == 1]
Y_2 = df.loc[df['Y'] == 2]
# Number of rows per X value: overall, and within each class
Count_0 = df.groupby(['X']).size().to_frame('Count').reset_index()
Count_1 = Y_1.groupby(['X']).size().to_frame('Count').reset_index()
Count_2 = Y_2.groupby(['X']).size().to_frame('Count').reset_index()
height_0 = Count_0.Count
height_1 = Count_1.Count
height_2 = Count_2.Count
bars = Count_0.X
fig, ax1 = plt.subplots(1, 1)
y_pos = np.arange(len(bars))
p1 = plt.bar(y_pos, height_0)
# Enlarge the title, axis labels and tick labels
for item in ([ax1.title, ax1.xaxis.label, ax1.yaxis.label] +
             ax1.get_xticklabels() + ax1.get_yticklabels()):
    item.set_fontsize(22)
plt.xlabel('X')
plt.ylabel('Count')
plt.xticks(y_pos, bars)
plt.yticks(np.arange(0, 4.1, 1))
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
plt.show()
plt.clf()
But when I try to stack it by class "Y":
p2 = plt.bar(y_pos, height_2, bottom = height_1)
I get:
ValueError: incompatible sizes: argument 'height' must be length 3 or scalar
I think that the problem might be that there are empty columns of Y = 2 and Y = 3 due to these classes not having any instances with X = 2. I would like X on the X axis and Y to be the colour please!
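One way to check this hypothesis is to compare the number of counts per class with the number of bar positions. The following is only a sketch and not part of the original attempt: the names count_1, count_2, all_x, aligned_1 and aligned_2 are made up for illustration, and it assumes that filling the missing X values with a count of 0 via reindex is an acceptable way to line the classes up.

```python
import pandas as pd

df = pd.DataFrame({'Y': [1, 1, 1, 1, 1, 2, 3, 2],
                   'X': [2, 2, 2, 2, 3, 3, 3, 4]})

# Count rows per X for each class separately, as in the question.
count_1 = df.loc[df['Y'] == 1].groupby('X').size()
count_2 = df.loc[df['Y'] == 2].groupby('X').size()

# Every X value that gets a bar position on the axis.
all_x = sorted(df['X'].unique())

# Each class only has entries for the X values it actually contains,
# so the per-class Series can be shorter than the list of bar positions.
print(len(all_x), len(count_1), len(count_2))

# reindex adds the missing X values with a count of 0 (assumed fix),
# giving one height per bar position for every class.
aligned_1 = count_1.reindex(all_x, fill_value=0)
aligned_2 = count_2.reindex(all_x, fill_value=0)
print(aligned_1.tolist(), aligned_2.tolist())
```

With heights aligned like this, plt.bar(y_pos, aligned_2, bottom=aligned_1) receives arrays of the same length as y_pos.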
Upvotes: 1
Views: 411
Reputation: 12406
If using a different Python plotting library is an option, here is an approach for a stacked bar chart using Altair. No groupby is needed.
Import and setting
import altair as alt
alt.renderers.enable('notebook')
Stacked bar plot
alt.Chart(df).mark_bar().encode(
    alt.X('X:N', axis=alt.Axis(labelAngle=0, tickSize=10)),
    alt.Y('count(Y):Q', axis=alt.Axis(title='Total count')),
    color='Y:N'
).properties(
    width=350,
    height=350
).configure_axis(
    titleFontSize=14,
    labelFontSize=12
).configure_legend(
    titleFontSize=14,
    labelFontSize=12
)
Output
Here are the links to customize the
Initial Attempt
Deleted due to wrong interpretation of OP question.
Upvotes: 0
Reputation: 153460
IIUC, you want this:
import pandas as pd
df = pd.DataFrame({'Y': [1,1,1,1,1,2,3,2],
                   'X': [2,2,2,2,3,3,3,4]})
df.groupby(['X','Y'])['Y'].count().unstack().plot.bar(stacked=True)
Output:
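To see what the one-liner is doing, it may help to build the intermediate count table and stack the bars by hand with matplotlib's bottom argument. This is a rough sketch rather than the only way to do it: the names counts, bottom and y_class are made up for illustration, and it uses size() with unstack(fill_value=0) in place of count().unstack() so that missing X/Y combinations become 0 instead of NaN.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Y': [1, 1, 1, 1, 1, 2, 3, 2],
                   'X': [2, 2, 2, 2, 3, 3, 3, 4]})

# One row per X, one column per Y; combinations that never occur become 0.
counts = df.groupby(['X', 'Y']).size().unstack(fill_value=0)
print(counts)

fig, ax = plt.subplots(figsize=(8, 5))

# Running total of the bar heights already drawn at each X position.
bottom = np.zeros(len(counts))
for y_class in counts.columns:
    heights = counts[y_class].to_numpy()
    # Draw this class on top of the classes drawn so far.
    ax.bar(counts.index.astype(str), heights, bottom=bottom,
           label=f'Y = {y_class}')
    bottom = bottom + heights

ax.set_xlabel('X')
ax.set_ylabel('Count')
ax.legend()
```

Calling plt.show() afterwards displays the figure. Because every column of counts has one entry per X value, the heights and bottom arrays always have matching lengths.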
Upvotes: 1