Zez

Reputation: 135

Barplot from a dataframe using a column to set the bar colors

I have a dataframe like the following (only a subset is shown):

    Species     Pathway        Number of Gene Families
0   Glovio      ABC                    0.5
1   Glovio      ABC/Synthase           1.0
2   Glovio      Synthase               0.0
3   Glovio      Wzy                   10.0
4   Glovio      Wzy/ABC                0.0
5   n2          ABC                    2.0
6   n2          ABC/Synthase           0.0
7   n2          Synthase               13.0
8   n2          Wzy                    7.0
9   n2          Wzy/ABC                0.0
10  Glokil      ABC                    2.0
11  Glokil      ABC/Synthase           1.0
12  Glokil      Synthase               0.0
13  Glokil      Wzy                    4.0
14  Glokil      Wzy/ABC                0.0

I want to draw a stacked bar plot with one bar per species on the x-axis. The y-axis should show the Number of Gene Families, with each bar's segments colour-coded by Pathway.

I have tried simple things, such as:

df[['Pathway']].plot(kind='bar', stacked=True)

But I get an error stating that:

Empty 'DataFrame': no numeric data to plot

Any ideas?

Thank you!

Upvotes: 0

Views: 117

Answers (3)

Quang Hoang

Reputation: 150735

I would do a set_index().unstack():

(df.set_index(['Species','Pathway'])
   ['Number of Gene Families']
   .unstack('Pathway')
   .plot.bar(stacked=True)
)
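For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the subset from the question and applies the same reshape. The values are copied from the table above; the variable names (`species`, `pathways`, `counts`, `wide`) are only illustrative. It also shows why the original attempt failed: `df[['Pathway']]` contains only strings, whereas the reshaped frame has one numeric column per pathway.

```python
import pandas as pd

# Rebuild the subset shown in the question (values copied from the table).
species = ["Glovio"] * 5 + ["n2"] * 5 + ["Glokil"] * 5
pathways = ["ABC", "ABC/Synthase", "Synthase", "Wzy", "Wzy/ABC"] * 3
counts = [0.5, 1.0, 0.0, 10.0, 0.0,
          2.0, 0.0, 13.0, 7.0, 0.0,
          2.0, 1.0, 0.0, 4.0, 0.0]
df = pd.DataFrame({"Species": species,
                   "Pathway": pathways,
                   "Number of Gene Families": counts})

# Reshape to wide form: one row per species, one numeric column per pathway.
wide = (df.set_index(["Species", "Pathway"])["Number of Gene Families"]
          .unstack("Pathway"))
print(wide)

# Each column becomes one coloured segment of the stacked bar.
ax = wide.plot.bar(stacked=True)
ax.set_ylabel("Number of Gene Families")
```

In a plain script (outside a notebook) you would typically follow this with `matplotlib.pyplot.show()` to display the figure.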

Upvotes: 0

Ben.T

Reputation: 29635

You can do it after reshaping the dataframe, for example:

df.groupby(['Species', 'Pathway'])['Number of Gene Families'].sum()\
  .unstack().plot(kind='bar', stacked=True)

Or with pivot, which gives the same result when each Species/Pathway pair appears only once:

df.pivot(index='Species', columns='Pathway', values='Number of Gene Families')\
  .plot(kind='bar', stacked=True)
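One difference worth knowing, in case the full dataframe has repeated Species/Pathway pairs: `pivot` raises a `ValueError` on duplicates, while `groupby(...).sum()` or `pivot_table(..., aggfunc='sum')` adds them up. The sketch below uses a small made-up frame with one deliberately duplicated row (the duplicate is hypothetical, not from the question's data) to illustrate this.

```python
import pandas as pd

df = pd.DataFrame({
    "Species": ["Glovio", "Glovio", "n2", "n2"],
    "Pathway": ["ABC", "Wzy", "ABC", "Wzy"],
    "Number of Gene Families": [0.5, 10.0, 2.0, 7.0],
})

# Hypothetical case: the first row (Glovio / ABC) appears twice.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)

# pivot cannot reshape when an index/column pair is repeated.
try:
    dup.pivot(index="Species", columns="Pathway",
              values="Number of Gene Families")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead (here by summing),
# which matches the groupby(...).sum().unstack() approach.
summed = dup.pivot_table(index="Species", columns="Pathway",
                         values="Number of Gene Families", aggfunc="sum")
grouped = (dup.groupby(["Species", "Pathway"])["Number of Gene Families"]
              .sum().unstack())
print(summed)
```

Either `summed` or `grouped` can then be passed to `.plot(kind='bar', stacked=True)` as above.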

Upvotes: 1

yatu

Reputation: 88236

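In seaborn you can specify a hue variable when using sns.barplot, which determines the color of the bars according to the different levels. Note that this draws the pathways as side-by-side bars within each species rather than stacking them:

import seaborn as sns
sns.barplot(data=df, x='Species', y='Number of Gene Families', hue='Pathway')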

Upvotes: 1
