Haris

Reputation: 19

Pandas/Matplotlib - How to sort values in a plot

I have a dataset with columns "Company" and "Outcome".

Company is a company name, and Outcome is either Success or Failure.

I have created the following graph with this code. How can I sort it so that the company with the most outcomes (Success and Failure combined) is at the top and the rest follow in descending order?

The code I used is:

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B'], 'Outcome': ['Success', 'Success', 'Failure', 'Failure', 'Success', 'Failure', 'Success']})

df.groupby(['Company', 'Outcome']).size().unstack().plot(kind='barh', stacked=True)

plt.show()

Additionally, including sort_values() after size() does not appear to have any effect, so clearly I am using it wrong. Any advice?

[Image: Bar Graph]

Upvotes: 0

Views: 2646

Answers (1)

r-beginners

Reputation: 35240

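As noted in the comments, a horizontal bar chart draws the rows of the data from the bottom of the axis upward, so data sorted in descending order ends up with the largest bar at the bottom. Sort the data in descending order, then invert the y-axis so that the largest bar is at the top.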
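As for why sort_values() right after size() seems to do nothing: as far as I can tell, unstack() rebuilds the row index in sorted label order by default, so any ordering applied before it is lost. A minimal check using the question's data (the names counts, plain, presorted and same_order are only illustrative, and the behaviour may depend on the pandas version and on the sort option of unstack in newer releases):

```python
import pandas as pd

df = pd.DataFrame({
    'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Outcome': ['Success', 'Success', 'Failure', 'Failure',
                'Success', 'Failure', 'Success'],
})

# One count per (Company, Outcome) pair, as a Series with a MultiIndex
counts = df.groupby(['Company', 'Outcome']).size()

# Unstack directly, and unstack after sorting the counts first
plain = counts.unstack()
presorted = counts.sort_values(ascending=False).unstack()

# With the default unstack behaviour both tables come back in the
# same (alphabetical) company order, so the earlier sort has no effect
same_order = list(plain.index) == list(presorted.index)
print(same_order)
```

So the sorting has to happen after unstack(), on the table that is actually plotted.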

Added: to sort by the total count per company (Failure plus Success), compute a Total column and sort on it:

import pandas as pd
import matplotlib.pyplot as plt

df2 = pd.DataFrame({'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B', 'D', 'D', 'D', 'D', 'D', 'D'],
                    'Outcome': ['Success', 'Success', 'Failure', 'Failure', 'Success', 'Failure', 'Success', 'Success', 'Success', 'Success', 'Success', 'Success', 'Success']})

# Count outcomes per company; fill_value=0 avoids NaN where a company lacks an outcome
df3 = df2.groupby(['Company', 'Outcome']).size().unstack(fill_value=0)
df3['Total'] = df3['Failure'] + df3['Success']
df3.sort_values('Total', ascending=False, inplace=True)
# barh draws the first row at the bottom, so invert the y-axis to put the largest on top
ax = df3[['Failure', 'Success']].plot(kind='barh', stacked=True)
ax.invert_yaxis()
plt.show()
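An alternative sketch, not part of the code above: pd.crosstab builds the same count table with zeros for missing combinations, and the rows can be reordered by their totals without adding a helper column. The names ct, order and plot_sorted are only illustrative, and the plotting helper assumes matplotlib is installed.

```python
import pandas as pd

df2 = pd.DataFrame({
    'Company': ['C', 'A', 'A', 'B', 'B', 'B', 'B',
                'D', 'D', 'D', 'D', 'D', 'D'],
    'Outcome': ['Success', 'Success', 'Failure', 'Failure', 'Success',
                'Failure', 'Success', 'Success', 'Success', 'Success',
                'Success', 'Success', 'Success'],
})

# Count table: one row per company, one column per outcome, 0 where absent
ct = pd.crosstab(df2['Company'], df2['Outcome'])

# Company labels ordered by total outcomes, largest first
order = ct.sum(axis=1).sort_values(ascending=False).index
ct = ct.loc[order]


def plot_sorted(table):
    """Draw a stacked horizontal bar chart with the first row on top."""
    import matplotlib.pyplot as plt  # imported here so the sorting runs without it

    ax = table.plot(kind='barh', stacked=True)
    ax.invert_yaxis()  # barh puts the first row at the bottom by default
    plt.show()
```

Calling plot_sorted(ct) should then give the same kind of chart, with the company that has the most outcomes at the top.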


Upvotes: 1
