A7X

Reputation: 129

How do I plot a categorical bar chart with different classes for each category in Matplotlib?

I was trying to reproduce this plot with Matplotlib:


Looking at the documentation, I found that the closest thing is a grouped bar chart. The problem is that I have a different number of bars for each category (subject, illumination, ...), whereas the Matplotlib example has exactly 2 classes (M, F) for each category (G1, G2, G3, ...). I don't know where to start. Does anyone have a clue? I think the trick the example uses to specify the bar locations:

x = np.arange(len(labels))  # the label locations
width = 0.35  # the width of the bars
fig, ax = plt.subplots()
rects1 = ax.bar(x - width/2, men_means, width, label='Men')
rects2 = ax.bar(x + width/2, women_means, width, label='Women')

does not work here, because the number of bars differs from one category to the next, so a fixed offset of ±width/2 around each label position cannot place them. It would be awesome if anyone could give me an idea. Thank you in advance!

Upvotes: 0

Views: 1771

Answers (1)

JohanC

Reputation: 80574

Supposing the data resides in a dataframe, the bars can be generated by looping through the categories:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# first create some test data, similar in structure to the question's
categories = ['Subject', 'Illumination', 'Location', 'Daytime']
# collect the rows in a list first (DataFrame.append was removed in pandas 2.0)
rows = []
for cat in categories:
    for _ in range(np.random.randint(2, 7)):
        rows.append({'Category': cat,
                     'Class': "".join(np.random.choice([*'tuvwxyz'], 10)),
                     'Value': np.random.uniform(10, 17)})
df = pd.DataFrame(rows, columns=['Category', 'Class', 'Value'])


fig, ax = plt.subplots()
start = 0 # position for first label
gap = 1 # gap between labels
labels = [] # list for all the labels
label_pos = np.array([]) # list for all the label positions

# loop through the categories of the dataframe
# provide a list of colors (at least as long as the expected number of categories)
for (cat, df_cat), color in zip(df.groupby('Category', sort=False), ['navy', 'orange'] * len(df)):
    num_in_cat = len(df_cat)
    # add a text for the category, centered over its bars (which sit at start .. start + num_in_cat - 1),
    # using "axes coordinates" for the y-axis
    ax.text(start + (num_in_cat - 1) / 2, 0.95, cat, ha='center', va='top', transform=ax.get_xaxis_transform())
    # positions for the labels of the current category
    this_label_pos = np.arange(start, start + num_in_cat)
    # create bars at the desired positions
    ax.bar(this_label_pos, df_cat['Value'], color=color)
    # store labels and their positions
    labels += df_cat['Class'].to_list()
    label_pos = np.append(label_pos, this_label_pos)
    start += num_in_cat + gap
# set the positions for the labels
ax.set_xticks(label_pos)
# set the labels
ax.set_xticklabels(labels, rotation=30)
# optionally set a new lower position for the y-axis
ax.set_ylim(ymin=9)
# optionally reduce the margin left and right
ax.margins(x=0.01)
plt.tight_layout()
plt.show()

[Figure: bar plot with different categories]
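The positioning logic in the loop above can also be looked at on its own, independent of the dataframe and the plotting. The following is only a minimal sketch of that idea: `grouped_positions` is a hypothetical helper name (not part of Matplotlib or pandas), and the counts `[3, 2, 4]` are made-up example values.

```python
import numpy as np

def grouped_positions(counts, gap=1):
    """Hypothetical helper: x positions for bars grouped by category.

    counts -- number of bars in each category
    gap    -- empty slots left between two consecutive categories
    Returns a list of position arrays (one per category) and the
    x coordinate of the middle of each category.
    """
    positions, centers = [], []
    start = 0
    for n in counts:
        pos = np.arange(start, start + n)  # consecutive slots for this category
        positions.append(pos)
        centers.append(float(pos.mean()))  # midpoint of the category's bars
        start += n + gap                   # skip `gap` slots before the next one
    return positions, centers

# made-up example: three categories with 3, 2 and 4 bars
positions, centers = grouped_positions([3, 2, 4])
```

Each array in `positions` could then be passed to `ax.bar` for its category, and each value in `centers` used as the x coordinate of the category text.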

Upvotes: 3
