Reputation: 19
I have a dataset with columns "Company" and "Outcome".
Company is a company name; Outcome is either Success or Failure.
I created a stacked horizontal bar chart with the code below. How can I sort it so that the company with the most outcomes (Success and Failure combined) is at the top, with the remaining companies in descending order?
The code I used is:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B'], 'Outcome': ['Success', 'Success', 'Failure', 'Failure', 'Success', 'Failure', 'Success']})
df.groupby(['Company', 'Outcome']).size().unstack().plot(kind='barh', stacked=True)
plt.show()
Additionally, calling sort_values() after size() does not appear to have any effect, so clearly I am using it wrong. Any advice?
Upvotes: 0
Views: 2646
Reputation: 35240
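As noted in the comments, a horizontal bar chart draws the rows of the data from the bottom up, so the first row ends up at the bottom. To get the largest bar at the top, sort the data in descending order and invert the y-axis.
Added: the bars need to be sorted by total count (Failure plus Success), so a Total column is computed first and used as the sort key.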
import pandas as pd
import matplotlib.pyplot as plt

df2 = pd.DataFrame({'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B', 'D', 'D', 'D', 'D', 'D', 'D'],
                    'Outcome': ['Success', 'Success', 'Failure', 'Failure', 'Success', 'Failure', 'Success', 'Success', 'Success', 'Success', 'Success', 'Success', 'Success']})
# fill_value=0 gives 0 instead of NaN for companies missing one of the outcomes
df3 = df2.groupby(['Company', 'Outcome']).size().unstack(fill_value=0)
df3['Total'] = df3['Failure'] + df3['Success']
df3 = df3.sort_values('Total', ascending=False)
ax = df3[['Failure', 'Success']].plot(kind='barh', stacked=True)
ax.invert_yaxis()  # barh puts the first row at the bottom; invert so the largest total is on top
plt.show()
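The same idea can also be sketched without adding a helper column, here on the question's original df. This is only one possible variant, not the only way to do it: the names counts and order and the output file name outcomes_sorted.png are placeholders chosen for illustration, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display); drop this line for interactive use
import pandas as pd

df = pd.DataFrame({
    'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Outcome': ['Success', 'Success', 'Failure', 'Failure', 'Success', 'Failure', 'Success'],
})

# One row per company, one column per outcome; missing combinations become 0
counts = df.groupby(['Company', 'Outcome']).size().unstack(fill_value=0)

# Row totals (Failure + Success), sorted from largest to smallest
order = counts.sum(axis=1).sort_values(ascending=False).index

# Reorder the rows by total; the outcome columns themselves are unchanged
counts = counts.loc[order]

ax = counts.plot(kind='barh', stacked=True)
ax.invert_yaxis()  # barh draws the first row at the bottom, so flip the axis
ax.figure.savefig('outcomes_sorted.png')  # placeholder file name
```

Sorting has to happen after unstack(), on the per-company totals. Calling sort_values() directly after size() only reorders the individual (Company, Outcome) counts, and unstack() then rebuilds the table with the companies back in label order.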
Upvotes: 1