DrakeMurdoch

Reputation: 859

Pandas stacked bar creating many individual plots with incorrect bottom values

Given a DataFrame (generated from a CSV that contains the names and orders and is updated every day):

# Note that this is just an example df; the real one can have N names in n shuffled orders
df = pd.read_csv('names_and_orders.csv', header=0)
print(df)
    names    order
0   mike     0
1   jo       1
2   mary     2
3   jo       0
4   mike     1 
5   mary     2
6   mike     0 
7   mary     1
8   jo       2

I am turning this into a stacked bar plot using pandas' stacked bar functionality and a for loop, as shown below.

# Create list of names from original df
names1 = df['names'].drop_duplicates().tolist()
N = len(names1)
viridis = cm.get_cmap('viridis', 100)

# Get count of each name at each order
df_count = df.groupby(['order', 'names']).size().reset_index(name='count')

# Plot count vs order in a stacked bar with the label as the current name
for i in range(len(names1)):
    values = list(df_count[df_count['names'] == names1[i]].loc[:, 'count'])
    df_count[df_count['names'] == names1[i]].plot.bar(x='order', y='count', color=viridis(i / N), stacked=True,
                                                      bottom=values, edgecolor='black', label=names1[i])
    values += values
# Add ticks, labels, title, and legend to plot
plt.xticks(np.arange(0, N, step=1))
plt.xlabel('Order')
plt.yticks(np.arange(0, df_count['count'].max(), step=1))
plt.ylabel('Count')
plt.title('How many times each person has been at each order number')
plt.legend()
plt.show()

Given this code, there are two main issues I am having:

  1. It currently plots every name on a separate figure instead of making one stacked bar plot.
  2. I don't believe the values used for the bottom kwarg are correct.

Upvotes: 1

Views: 59

Answers (1)

Quang Hoang

Reputation: 150765

I think you're overthinking this. Just unstack the groupby and plot:

df_count = df.groupby(['order', 'names']).size().unstack('names')
df_count.plot.bar(stacked=True)
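
To see what the unstacked table looks like, here is a minimal sketch that rebuilds the sample data from the question inline (the real CSV isn't available, so the DataFrame literal stands in for it). The `fill_value=0` argument is my addition, not part of the answer above: without it, a name that never appears at some order shows up as NaN rather than 0.

```python
import pandas as pd

# Sample data copied from the question; stands in for names_and_orders.csv.
df = pd.DataFrame({
    "names": ["mike", "jo", "mary", "jo", "mike", "mary", "mike", "mary", "jo"],
    "order": [0, 1, 2, 0, 1, 2, 0, 1, 2],
})

# Count rows per (order, name) pair, then pivot the names into columns.
# fill_value=0 (an addition here) replaces missing pairs with 0 instead of NaN.
df_count = df.groupby(["order", "names"]).size().unstack("names", fill_value=0)

# One row per order, one column per name (columns come out sorted by name).
print(df_count)
```

Calling `.plot.bar(stacked=True)` on a table of this shape draws one bar per row (order) and one stacked segment per column (name), so pandas works out the bottoms itself.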

Output: [image: the resulting stacked bar chart]
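
As for why the original loop misbehaves, as far as I can tell: each `DataFrame.plot.bar` call made without an `ax` argument opens a new figure, and `bottom=values` passes each name's own counts rather than the accumulated height of the bars already drawn (`values += values` only extends the list with itself, and it is overwritten on the next iteration anyway). If you do want to keep a loop, a rough sketch with plain matplotlib could look like the following. The sample data is the one from the question, and the `Agg` backend and the output filename `stacked_counts.png` are just assumptions so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data copied from the question; stands in for names_and_orders.csv.
df = pd.DataFrame({
    "names": ["mike", "jo", "mary", "jo", "mike", "mary", "mike", "mary", "jo"],
    "order": [0, 1, 2, 0, 1, 2, 0, 1, 2],
})

# Rows are orders, columns are names; missing pairs become 0.
counts = df.groupby(["order", "names"]).size().unstack("names", fill_value=0)

fig, ax = plt.subplots()        # a single Axes, reused for every name
bottom = np.zeros(len(counts))  # running height of each stack, one entry per order
cmap = plt.get_cmap("viridis")
n_names = len(counts.columns)

for i, name in enumerate(counts.columns):
    heights = counts[name].to_numpy()
    ax.bar(counts.index.to_numpy(), heights, bottom=bottom,
           color=cmap(i / n_names), edgecolor="black", label=name)
    bottom = bottom + heights   # the next name starts on top of this one

ax.set_xticks(list(counts.index))
ax.set_xlabel("Order")
ax.set_ylabel("Count")
ax.set_title("How many times each person has been at each order number")
ax.legend()
fig.savefig("stacked_counts.png")  # hypothetical output path
```

This should give the same kind of chart as the one-liner; the only reason to prefer it is if you need per-name control over colors or labels.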

Upvotes: 2
