R. Cox

Reputation: 879

Stacked Bar Graph with empty columns

I'm trying to do a stacked bar graph. I can do a basic bar graph:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Y': [1,1,1,1,1,2,3,2],
                   'X': [2,2,2,2,3,3,3,4]})

Y_1 = df.loc[df['Y'] == 1]
Y_2 = df.loc[df['Y'] == 2]

Count_0 = df.groupby(['X']).size().to_frame('Count').reset_index()
Count_1 = Y_1.groupby(['X']).size().to_frame('Count').reset_index()
Count_2 = Y_2.groupby(['X']).size().to_frame('Count').reset_index()

height_0 = Count_0.Count
height_1 = Count_1.Count
height_2 = Count_2.Count
bars     = Count_0.X

fig, ax1 = plt.subplots(1, 1)

y_pos = np.arange(len(bars))

p1 = plt.bar(y_pos, height_0) 

for item in ([ax1.title, ax1.xaxis.label, ax1.yaxis.label] +
             ax1.get_xticklabels() + ax1.get_yticklabels()):
    item.set_fontsize(22)

plt.xlabel('X')
plt.ylabel('Count')
plt.xticks(y_pos, bars)
plt.yticks(np.arange(0, 4.1, 1))
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
plt.show()
plt.clf()

But when I try to stack it by class "Y":

p2 = plt.bar(y_pos, height_2, bottom = height_1)

I get:

ValueError: incompatible sizes: argument 'height' must be length 3 or scalar

I think the problem is that some columns are empty: classes Y = 2 and Y = 3 have no instances with X = 2, so their grouped counts have fewer entries than there are bars. I would like X on the x-axis and Y to be the colour, please!
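A quick length check is consistent with that suspicion. The sketch below rebuilds the same `df` and compares the number of bars with the number of counts per class; the names `x_values`, `count_1` and `count_2` are illustrative, not from the code above. Reindexing each count onto the full set of X values, with zeros for the gaps, is one possible way to make the lengths agree.

```python
import pandas as pd

df = pd.DataFrame({'Y': [1, 1, 1, 1, 1, 2, 3, 2],
                   'X': [2, 2, 2, 2, 3, 3, 3, 4]})

# Every X value that should get a bar.
x_values = sorted(df['X'].unique())

# Per-class counts only contain the X values that class actually has.
count_1 = df[df['Y'] == 1].groupby('X').size()
count_2 = df[df['Y'] == 2].groupby('X').size()
print(len(x_values), len(count_1), len(count_2))

# Align each class to the full X axis, filling missing X values with 0.
height_1 = count_1.reindex(x_values, fill_value=0)
height_2 = count_2.reindex(x_values, fill_value=0)
print(height_1.tolist(), height_2.tolist())
```

With aligned heights, `plt.bar(y_pos, height_2, bottom=height_1)` receives arrays of the same length as `y_pos`.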

Upvotes: 1

Views: 411

Answers (2)

edesz

Reputation: 12406

If you can use a different Python plotting library, here is an approach for a stacked bar chart using Altair. No groupby is needed.

Imports and settings

import altair as alt
alt.renderers.enable('notebook')

Stacked bar plot

alt.Chart(df).mark_bar().encode(
    alt.X('X:N', axis=alt.Axis(labelAngle=0, tickSize=10)),
    alt.Y('count(Y):Q', axis=alt.Axis(title='Total count')),
    color='Y:N'
).properties(
    width=350,
    height=350
).configure_axis(
    titleFontSize=14,
    labelFontSize=12
).configure_legend(
    titleFontSize=14,
    labelFontSize=12
)

Output

A stacked bar chart of the count per X value, coloured by Y.

Here are the links to customize the

Initial Attempt

Deleted due to wrong interpretation of OP question.

Upvotes: 0

Scott Boston

Reputation: 153460

IIUC, you want this:

df = pd.DataFrame({'Y': [1,1,1,1,1,2,3,2],
                   'X': [2,2,2,2,3,3,3,4]})
df.groupby(['X','Y'])['Y'].count().unstack().plot.bar(stacked=True)

Output:

A stacked bar chart with X on the x-axis and one colour per value of Y.
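For anyone who wants to stay with plain matplotlib calls, here is a minimal sketch of the same idea, not part of the one-liner above. It assumes the same `df`, builds the X-by-Y count table with `unstack(fill_value=0)` so that missing combinations become zeros, and then stacks one `bar` call per class on a running `bottom`. The names `counts`, `x_pos` and `bottom` are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({'Y': [1, 1, 1, 1, 1, 2, 3, 2],
                   'X': [2, 2, 2, 2, 3, 3, 3, 4]})

# Rows are X values, columns are Y classes; absent (X, Y) pairs become 0.
counts = df.groupby(['X', 'Y']).size().unstack(fill_value=0)

fig, ax = plt.subplots()
x_pos = np.arange(len(counts.index))
bottom = np.zeros(len(counts.index))

for y_class in counts.columns:
    heights = counts[y_class].to_numpy()
    ax.bar(x_pos, heights, bottom=bottom, label=f'Y = {y_class}')
    bottom = bottom + heights  # the next class stacks on top of this one

ax.set_xticks(x_pos)
ax.set_xticklabels([str(x) for x in counts.index])
ax.set_xlabel('X')
ax.set_ylabel('Count')
ax.legend()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure. Because every column of `counts` has one entry per X value, each `bar` call gets heights of the same length as `x_pos`, which avoids the `incompatible sizes` error from the question.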

Upvotes: 1
