Reputation: 353
I have created a grouped barplot which expresses the percentage of cases won per time interval. I would like to annotate the barplot with the number of cases won per time interval.
Here is my code:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame({
'years': ['1994-1998','1999-2003','2004-2008','2009-2013','2013-2017','2018-2022'],
'Starfish number of cases': [10,8,31,12,2,3],
'Starfish percent of wins': [0,0.25,0.225806451612903,0.416666666666666,1,0],
'Jellyfish number of cases':[597,429,183,238,510,595],
'Jellyfish percent of wins':[0.362646566164154,0.273892773892773,0.423497267759562,0.478991596638655,0.405882352941176,0.408403361344537],
})
df = pd.melt(df, id_vars=['years'], value_vars=['Starfish percent of wins', 'Jellyfish percent of wins'])
sns.set_theme(style="whitegrid")
# Initialize the matplotlib figure
f, ax = plt.subplots(figsize=(30, 15))
sns.barplot(x="years", y="value", hue='variable', data=df)
for p in ax.patches:
    ax.annotate(str(p.get_height()), (p.get_x() * 1.005, p.get_height() * 1.005))
I have tried to include the number of cases in the melt function (i.e. df = pd.melt(df, id_vars=['years'], value_vars=['Starfish number of cases','Jellyfish number of cases','Starfish percent of wins', 'Jellyfish percent of wins'])), but this adds additional bars representing the total number of cases.
I tried to modify the answer here by adding the lines below, but the results show percentage annotations, not number of cases:
for p,years in zip(ax.patches, df['Starfish number of cases','Jellyfish number of cases']):
    ax.annotate(years, xy=(p.get_x()+p.get_width()/2, p.get_height()),
                ha='center', va='bottom')
There's an answer here, but it's complicated. Is there a simpler way?
Upvotes: 0
Views: 47
Reputation: 80279
The approach below includes the 'number of cases' columns in the melt. The bar plot is then created with only the percentages, by restricting hue_order to the two 'percent of wins' variables.
The bars are stored in ax.containers. There are 2 containers, one for each hue value. ax.bar_label() takes a container and a list of labels as input.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
df_orig = pd.DataFrame({
'years': ['1994-1998', '1999-2003', '2004-2008', '2009-2013', '2013-2017', '2018-2022'],
'Starfish number of cases': [10, 8, 31, 12, 2, 3],
'Starfish percent of wins': [0, 0.25, 0.2258064516, 0.41666666666, 1, 0],
'Jellyfish number of cases': [597, 429, 183, 238, 510, 595],
'Jellyfish percent of wins': [0.3626465661, 0.2738927739, 0.4234972677, 0.4789915966, 0.4058823529, 0.4084033613],
})
df = pd.melt(df_orig, id_vars=['years'],
             value_vars=['Starfish number of cases', 'Starfish percent of wins',
                         'Jellyfish number of cases', 'Jellyfish percent of wins'])
sns.set_theme(style="whitegrid")
# Initialize the matplotlib figure
fig, ax = plt.subplots(figsize=(12, 5))
sns.barplot(x="years", y="value", hue='variable',
            hue_order=['Starfish percent of wins', 'Jellyfish percent of wins'], data=df, ax=ax)
# One container per hue value; label each bar with the matching number of cases
for bargroup, variable in zip(ax.containers, ['Starfish number of cases', 'Jellyfish number of cases']):
    labels = ['' if val == 0.0 else f'{val:.0f}' for val in df[df['variable'] == variable]['value']]
    ax.bar_label(bargroup, labels)
sns.despine()
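As a possible variation, the count labels could be read straight from the wide dataframe, so that only the percentage columns need to be melted. The sketch below is only an illustration under some assumptions: count_labels and label_bars_with_counts are hypothetical helper names (not part of seaborn or matplotlib), ax.containers is assumed to hold one bar container per hue value in the same order as the count columns, and the bars in each container are assumed to follow the row order of the wide dataframe. ax.bar_label() needs matplotlib 3.4 or newer.

```python
def count_labels(counts, hide_zero=True):
    """Format case counts as bar labels.

    With hide_zero=True, a count of 0 becomes an empty label.
    """
    return ['' if hide_zero and c == 0 else f'{c:.0f}' for c in counts]


def label_bars_with_counts(ax, wide_df, count_columns):
    """Label each hue's bars with the matching count column (hypothetical helper).

    Assumption: ax.containers has one container per hue value, ordered like
    count_columns, and each container's bars follow the row order of wide_df.
    """
    for container, column in zip(ax.containers, count_columns):
        ax.bar_label(container, labels=count_labels(wide_df[column]))
```

With the plot above, this would be called after sns.barplot() as label_bars_with_counts(ax, df_orig, ['Starfish number of cases', 'Jellyfish number of cases']). If the years were reordered in the plot (for example with the order parameter), the rows of the wide dataframe would need to be reordered the same way first.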
Upvotes: 1