Coldchain9
Coldchain9

Reputation: 1745

Annotating Pandas Barplot with an Additional Value Unrelated to the Plot's x and y Axis

I am producing a pandas barplot with raw counts represented by the plot, however I would like to annotate the bars with the pct of those counts as a whole. I have seen a lot of people using ax.patches methods to annotate but my values are unrelated to the get_height of the actual bars.

Here is some toy data. The plot produced will be the individual counts of the specific type. However, I want to add annotations above that specific bar that represent the pct total of that specific type to all types for that person's name.

Let me know if you need any more clarification.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt


d = {'ID': [1,1,1,2,2,3,3,3,4], 
     'name': ['bob','bob','bob','shelby','shelby','jordan','jordan','jordan','jeff'],
     'type': ['type1','type2','type4','type1','type6','type5','type8','type2',None]}
df: pd.DataFrame = pd.DataFrame(data=d)

df_pivot: pd.DataFrame = df.pivot_table(index='type', columns=['name'], values='ID', aggfunc={'ID': np.sum}).fillna(0)

# create percent totals of the specific type's row of the total
df_pivot['bob_pct_total']: pd.Series = (df_pivot['bob']/df_pivot['bob'].sum()).mul(100).round(1)
df_pivot['shelby_pct_total']: pd.Series = (df_pivot['shelby']/df_pivot['shelby'].sum()).mul(100).round(1)
df_pivot['jordan_pct_total']: pd.Series = (df_pivot['jordan']/df_pivot['jordan'].sum()).mul(100).round(1)
df_pivot.head(10)

name    bob jordan  shelby  bob_pct_total   shelby_pct_total    jordan_pct_total
type                        
type1   1.0   0.0      2.0           33.3               50.0                0.0
type2   1.0   3.0      0.0           33.3                0.0               33.3
type4   1.0   0.0      0.0           33.3                0.0                0.0
type5   0.0   3.0      0.0            0.0                0.0               33.3
type6   0.0   0.0      2.0            0.0               50.0                0.0
type8   0.0   3.0      0.0            0.0                0.0               33.3

fig, ax = plt.subplots(figsize=(15,15))
df_pivot.plot(kind='bar', y=['bob','jordan','shelby'], ax=ax)

Upvotes: 0

Views: 234

Answers (1)

JohanC
JohanC

Reputation: 80279

You can use the old approach, looping through the bars, using the height to position whatever text you want. Since matplotlib 3.4.0 there also is a new function bar_label that removes much of the boilerplate:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

d = {'ID': [1, 1, 1, 2, 2, 3, 3, 3, 4],
     'name': ['bob', 'bob', 'bob', 'shelby', 'shelby', 'jordan', 'jordan', 'jordan', 'jeff'],
     'type': ['type1', 'type2', 'type4', 'type1', 'type6', 'type5', 'type8', 'type2', None]}
df: pd.DataFrame = pd.DataFrame(data=d)

df_pivot: pd.DataFrame = df.pivot_table(index='type', columns=['name'], values='ID', aggfunc={'ID': np.sum}).fillna(0)

# create percent totals of the specific type's row of the total
df_pivot['bob_pct_total']: pd.Series = (df_pivot['bob'] / df_pivot['bob'].sum()).mul(100).round(1)
df_pivot['shelby_pct_total']: pd.Series = (df_pivot['shelby'] / df_pivot['shelby'].sum()).mul(100).round(1)
df_pivot['jordan_pct_total']: pd.Series = (df_pivot['jordan'] / df_pivot['jordan'].sum()).mul(100).round(1)

fig, ax = plt.subplots(figsize=(12, 5))
columns = ['bob', 'jordan', 'shelby']
df_pivot.plot(kind='bar', y=['bob', 'jordan', 'shelby'], rot=0, ax=ax)
for bars, col in zip(ax.containers, ['bob_pct_total', 'jordan_pct_total', 'shelby_pct_total']):
    ax.bar_label(bars, labels=['' if val == 0 else f'{val}' for val in df_pivot[col]])
plt.tight_layout()
plt.show()

bar_label on pandas barplot

PS: To skip labeling the first bars, you could experiment with:

for bars, col in zip(ax.containers, ['bob_pct_total', 'jordan_pct_total', 'shelby_pct_total']):
    labels=['' if val == 0 else f'{val}' for val in df_pivot[col]]
    labels[0] = ''
    ax.bar_label(bars, labels=labels)

Upvotes: 1

Related Questions