Reputation: 1374
How can I plot this kind of chart from the following data in Seaborn?
data.csv:
1,2,3
2007,05,06
2007,05,06
2007,05,08
2007,05,08
2007,05,12
2007,05,15
2007,05,16
...
The bar chart I want to plot:
I would appreciate it if someone could show me how to plot this kind of bar chart from my data with Seaborn.
Upvotes: 2
Views: 4763
Reputation: 2511
With pandas, you can make a stacked bar plot simply with:
df.plot.bar(stacked=True)
So you just have to load or reshape your data first so that the months are the columns and the year is the index:
import numpy as np
import pandas as pd
import io
import matplotlib.pyplot as plt
import seaborn as sns
# sample data - 3rd column ignored
data = """
1,2,3
2007,05,06
2007,05,06
2007,06,08
2007,06,08
2007,05,06
2007,05,06
2007,06,08
2007,06,08
2007,05,06
2007,05,06
2007,06,08
2007,06,08
2008,03,12
2008,09,15
2008,02,16
2008,04,12
2008,05,15
2008,06,16
2008,03,12
2008,08,15
2008,02,16
2008,09,12
2008,05,15
2008,06,16
"""
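# read data: the header row "1,2,3" is replaced by the given names,
# and the second column is treated as the value to plot
df = pd.read_csv(io.StringIO(data), delimiter=',', names=['year', 'count', 'ignore'], header=0, index_col='year')
nyears = df.index.nunique()
# assumes exactly 12 rows per year, in month order
df['month'] = np.tile(np.arange(1, 13), nyears)
# one row per year, one column per month, then stack the columns
df.pivot(columns='month', values='count').plot.bar(stacked=True)
plt.show()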
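If each row of the file is instead a single dated record (year, month, day), as the sample in the question suggests, the months-as-columns table can be built by counting rows rather than by assigning months by position. A minimal sketch under that assumption; the sample rows, the column names and the `plot_stacked` helper are made up for illustration:

```python
import io
import pandas as pd

# Invented sample in the question's layout: header row "1,2,3", then year, month, day
raw = """1,2,3
2007,05,06
2007,05,06
2007,06,08
2008,05,12
2008,06,15
2008,06,16
"""

# Replace the "1,2,3" header with descriptive (assumed) column names
records = pd.read_csv(io.StringIO(raw), names=["year", "month", "day"], header=0)

# Count rows per (year, month): years become the index, months the columns
counts = pd.crosstab(records["year"], records["month"])
print(counts)


def plot_stacked(table):
    """Draw one stacked bar per index entry (needs matplotlib installed)."""
    return table.plot.bar(stacked=True)
```

Calling `plot_stacked(counts)` followed by `plt.show()` then gives one bar per year with one segment per month.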
Upvotes: 0
Reputation: 15953
The data you provided wasn't enough to create the plot, so I made a small sample to test on. The code is somewhat long because the data has to be reshaped first. The main idea is that a stacked bar plot is just regular bar plots layered on top of each other: draw the total for each bar first, then draw the lower segment over it.
import pandas as pd
import io
import matplotlib.pyplot as plt
import seaborn as sns
# sample data: each row is one record; the third column's values are not used, the rows are only counted
data = """
year,month,count
2007,05,06
2007,05,06
2007,06,08
2007,06,08
2008,05,12
2008,05,15
2008,06,16
"""
# read data
df = pd.read_csv(io.StringIO(data), delimiter=',')
# number of rows per (year, month) pair, indexed by a (year, month) MultiIndex
plot_data = df.groupby(['year', 'month']).count()
years = plot_data.index.get_level_values('year').unique()
# full bar per year (May + June), drawn first so it ends up behind
totals = plot_data.groupby(level='year')['count'].sum()
# errorbar=None needs seaborn >= 0.12; use ci=None on older versions
sns.barplot(x=years, y=totals.values, color="red", errorbar=None)
# bottom segment per year (May only, month 5), drawn over the full bar
may = plot_data.xs(5, level='month')['count'].reindex(years, fill_value=0)
bottom_plot = sns.barplot(x=years, y=may.values, color="#0000A3", errorbar=None)
bottom_plot.set_ylabel("Count")
bottom_plot.set_xlabel("Year")
plt.show()
The process can be extended to include all 12 months, but I'm not aware of a single call that does this without reshaping the data first.
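One possible way to extend the layering idea to any number of months is to take a running total across the months and draw the layers from tallest to shortest. This is only a sketch: the `plot_layered` helper, the palette choice and the legend handling are my own assumptions, and the data still has to be reshaped first.

```python
import io
import pandas as pd

raw = """year,month,count
2007,05,06
2007,05,06
2007,06,08
2007,06,08
2008,05,12
2008,05,15
2008,06,16
"""
df = pd.read_csv(io.StringIO(raw))

# Rows per (year, month), with years as the index and months as the columns
counts = df.groupby(["year", "month"]).size().unstack(fill_value=0)
# Running total across months: column m is the height of the stack up to month m
cumulative = counts.cumsum(axis=1)


def plot_layered(cumulative):
    """Hypothetical helper: overlay one seaborn barplot per month."""
    import matplotlib.pyplot as plt
    import seaborn as sns
    from matplotlib.patches import Patch

    months = list(cumulative.columns)
    colors = sns.color_palette("husl", n_colors=len(months))
    fig, ax = plt.subplots()
    # Tallest layer first, so each shorter layer is painted over it
    for month, color in reversed(list(zip(months, colors))):
        # errorbar=None needs seaborn >= 0.12; use ci=None on older versions
        sns.barplot(x=cumulative.index, y=cumulative[month].values,
                    color=color, errorbar=None, ax=ax)
    ax.legend(handles=[Patch(color=c, label=str(m)) for m, c in zip(months, colors)],
              title="Month")
    ax.set_xlabel("Year")
    ax.set_ylabel("Count")
    return ax
```

Calling `plot_layered(cumulative)` and then `plt.show()` draws the chart; the visible part of each layer is that month's own count, because everything below it is covered by the earlier months.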
Upvotes: 2