Reputation: 859
Given a DataFrame (generated from a CSV that contains the names and orders and is updated every day):
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import cm

# Note that this is just an example df; the real one can have N names in N shuffled orders
df = pd.read_csv('names_and_orders.csv', header=0)
print(df)
  names  order
0  mike      0
1    jo      1
2  mary      2
3    jo      0
4  mike      1
5  mary      2
6  mike      0
7  mary      1
8    jo      2
I am turning this into a stacked bar plot using pandas' stacked bar functionality and a for loop, as shown below.
# Create list of names from original df
names1 = df['names'].drop_duplicates().tolist()
N = len(names1)
# cm.get_cmap was removed in matplotlib 3.9; plt.get_cmap accepts the same arguments
viridis = plt.get_cmap('viridis', 100)
# Get count of each name at each order
df_count = df.groupby(['order', 'names']).size().reset_index(name='count')
# Plot count vs order in a stacked bar with the label as the current name
for i in range(len(names1)):
    values = list(df_count[df_count['names'] == names1[i]].loc[:, 'count'])
    df_count[df_count['names'] == names1[i]].plot.bar(x='order', y='count', color=viridis(i / N), stacked=True,
                                                      bottom=values, edgecolor='black', label=names1[i])
    values += values
# Add ticks, labels, title, and legend to plot
plt.xticks(np.arange(0, N, step=1))
plt.xlabel('Order')
plt.yticks(np.arange(0, df_count['count'].max(), step=1))
plt.ylabel('Count')
plt.title('How many times each person has been at each order number')
plt.legend()
plt.show()
Given this code, there are two main issues I am having. One is whether the values I use for the bottom kwarg are correct.
Upvotes: 1
Views: 59
Reputation: 150765
I think you're overthinking this. Just unstack the groupby and plot:
df_count = df.groupby(['order', 'names']).size().unstack('names')
df_count.plot.bar(stacked=True)
Output:
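To make the unstack step concrete, here is a minimal sketch that rebuilds the example frame from the question inline instead of reading the CSV. Two things in it are assumptions rather than part of the answer above: `fill_value=0` (so a name that never appears at an order shows as 0 instead of NaN) and the helper name `plot_counts`, which is hypothetical. The `bottoms` table is only there to illustrate what a correct `bottom` value would be for each bar, namely the running total of the names stacked beneath it; with `stacked=True` pandas works this out itself.

```python
import pandas as pd

# Sample data copied from the question; the real CSV is assumed to have
# the same two columns, 'names' and 'order'.
df = pd.DataFrame({
    "names": ["mike", "jo", "mary", "jo", "mike", "mary", "mike", "mary", "jo"],
    "order": [0, 1, 2, 0, 1, 2, 0, 1, 2],
})

# One row per order, one column per name, each cell a count.
# fill_value=0 is an addition here so missing combinations read as 0, not NaN.
df_count = df.groupby(["order", "names"]).size().unstack("names", fill_value=0)
print(df_count)

# Illustration only: the height at which each name's bar segment starts,
# i.e. the sum of the columns to its left in the same row.
bottoms = df_count.cumsum(axis=1) - df_count
print(bottoms)


def plot_counts(counts):
    """Hypothetical helper: draw the stacked bar chart and return the Axes."""
    ax = counts.plot.bar(stacked=True, colormap="viridis", edgecolor="black")
    ax.set_xlabel("Order")
    ax.set_ylabel("Count")
    ax.set_title("How many times each person has been at each order number")
    return ax
```

Calling `plot_counts(df_count)` followed by `plt.show()` (after `import matplotlib.pyplot as plt`) should draw one bar per order, with one coloured segment per name, in a single figure.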
Upvotes: 2