Plot Histogram on different axes

Question

I am reading a CSV file:

 Notation Level   RFResult   PRIResult   PDResult  Total Result
 AAA       1       1.23        0           2         3.23
 AAA       1       3.4         1           0         4.4
 BBB       2       0.26        1           1.42      2.68
 BBB       2       0.73        1           1.3       3.03
 CCC       3       0.30        0           2.73      3.03
 DDD       4       0.25        1           1.50      2.75
 AAA       5       0.25        1           1.50      2.75
 FFF       6       0.26        1           1.42      2.68
 ...
 ...

Here is the code

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('home\NewFiles\Files.csv')
Notation = df['Notation']
Level = df['Level']
RFResult = df['RFResult']
PRIResult = df['PRIResult']
PDResult = df['PDResult']

fig, axes = plt.subplots(nrows=7, ncols=1)
ax1, ax2, ax3, ax4, ax5, ax6, ax7 = axes.flatten()
n_bins = 13
ax1.hist(df['Total'], n_bins, histtype='bar')  # Currently this shows all Total Results in one plot
plt.show()

I want to show the Total Result of each Level on its own axes, as follows:

ax1 will show Level 1 Total Result

ax2 will show Level 2 Total Result

ax3 will show Level 3 Total Result

ax4 will show Level 4 Total Result

ax5 will show Level 5 Total Result

ax6 will show Level 6 Total Result

ax7 will show Level 7 Total Result

JohanC · Accepted Answer

You can select a filtered part of a dataframe by boolean indexing: df[df['Level'] == level]['Total']. You can loop through the axes using for ax in axes.flatten(). To also get the index, use for ind, ax in enumerate(axes.flatten()). Note that Python starts counting from 0, so add 1 to the index to get the level.
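As a minimal sketch of the filtering and the enumerate loop on their own, the snippet below uses a small made-up dataframe (the values and the counts_per_level dictionary are only for illustration; the column names 'Level' and 'Total' follow the example further down):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, not the asker's file
df = pd.DataFrame({"Level": [1, 1, 2, 2, 3],
                   "Total": [3.23, 4.4, 2.68, 3.03, 3.03]})

fig, axes = plt.subplots(nrows=3, ncols=1)
counts_per_level = {}
for ind, ax in enumerate(axes.flatten()):
    level = ind + 1  # enumerate starts at 0, the levels start at 1
    level_totals = df[df["Level"] == level]["Total"]  # keep only the rows of this level
    ax.hist(level_totals, bins=5)
    counts_per_level[level] = len(level_totals)

print(counts_per_level)
```

Each ax only receives the rows whose Level matches its position in the loop.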

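Note that in a normal string a backslash starts an escape sequence. When a path contains backslashes, prefix the string with r to make it a raw string, in which backslashes are taken literally: r'home\NewFiles\Files.csv'.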
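A short sketch of what the r prefix does, using the path from the question. Without the prefix, this particular literal is rejected by Python 3, because \N is read as the start of a named Unicode escape; the doubled-backslash form is the equivalent spelling without r:

```python
raw_path = r'home\NewFiles\Files.csv'        # raw string: backslashes are kept as-is
escaped_path = 'home\\NewFiles\\Files.csv'   # normal string: each backslash written twice

# Both spellings produce exactly the same text
print(raw_path == escaped_path)
print(raw_path)
```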

The default ylim is from 0 to the maximum bar height, plus some padding. This can be changed for each ax separately. In the example below a list of ymax values is used to show the principle.

ax.grid(True, axis='both') turns the grid on for that ax. Instead of 'both', 'x' or 'y' can be used to draw grid lines for only that axis. A grid line is drawn at each tick value. (The example below tries to use little space, so only a few gridlines are visible.)

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

N = 1000
df = pd.DataFrame({'Level': np.random.randint(1, 6, N), 'Total': np.random.uniform(1, 5, N)})

fig, axes = plt.subplots(nrows=5, ncols=1, sharex=True)
ymax_per_level = [27, 29, 28, 26, 27]
for ind, (ax, lev_ymax) in enumerate(zip(axes.flatten(), ymax_per_level)):
    level = ind + 1
    n_bins = 13
    ax.hist(df[df['Level'] == level]['Total'], bins=n_bins, histtype='bar')
    ax.set_ylabel(f'TL={level}') # to add the level in the ylabel
    ax.set_ylim(0, lev_ymax)
    ax.grid(True, axis='both')
plt.show()
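
The hard-coded ymax_per_level list above only illustrates the principle. As a possible variation (an assumption, not part of the original answer), the limit could be derived from the data: ax.hist returns the bar heights, so the tallest bar over all subplots can serve as a shared upper limit. The 5% padding factor and the random test data are arbitrary choices for this sketch:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # made-up data, seeded for repeatability
fig, axes = plt.subplots(nrows=2, ncols=1, sharex=True)

tallest = []
for ax in axes.flatten():
    # hist returns (bar heights, bin edges, patches)
    counts, bin_edges, _ = ax.hist(rng.uniform(1, 5, 200), bins=13)
    tallest.append(counts.max())

common_ymax = max(tallest) * 1.05  # tallest bar of any subplot, plus 5% padding
for ax in axes.flatten():
    ax.set_ylim(0, common_ymax)
```

With the same ylim on every ax, bar heights can be compared directly between levels.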

PS: A stacked histogram with custom legend and custom vertical lines could be created as:

import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import pandas as pd
import numpy as np

N = 1000
df = pd.DataFrame({'Level': np.random.randint(1, 6, N),
                   'RFResult': np.random.uniform(1, 5, N),
                   'PRIResult': np.random.uniform(1, 5, N),
                   'PDResult': np.random.uniform(1, 5, N)})
df['Total'] = df['RFResult'] + df['PRIResult'] + df['PDResult']

fig, axes = plt.subplots(nrows=5, ncols=1, sharex=True)
colors = ['crimson', 'limegreen', 'dodgerblue']
column_names = ['RFResult', 'PRIResult', 'PDResult']
level_vertical_line = [1, 2, 3, 4, 5]
for level, (ax, vertical_line) in enumerate(zip(axes.flatten(), level_vertical_line), start=1):
    n_bins = 13
    level_data = df[df['Level'] == level][column_names].to_numpy()
    # vertical_line = level_data.mean()
    ax.hist(level_data, bins=n_bins,
            histtype='bar', stacked=True, color=colors)
    ax.axvline(vertical_line, color='gold', ls=':', lw=2)
    ax.set_ylabel(f'TL={level}')  # to add the level in the ylabel
    ax.margins(x=0.01)
    ax.grid(True, axis='both')
legend_handles = [Patch(color=color) for color in colors]
axes[0].legend(legend_handles, column_names, ncol=len(column_names), loc='lower center', bbox_to_anchor=(0.5, 1.02))
plt.show()
