edoelas

Reputation: 357

Setting Y limit of matplotlib range automatically

I have been asked to plot some histograms and KDEs using seaborn. We only want to focus on a range of the X axis, so I use ax.set_xlim(20_000, 42_460). In some cases most of the data lies below 20,000, so the plot looks like this:

Plot from 20,000 to 42,460

The full plot looks like this:

Full plot

There is data, but since most of it lies in the range (0, 20,000), matplotlib fits the Y limit to that range, and the data in the range (20,000, 42,460) cannot be made out.

I would like a way to automatically adjust the Y limit so that the data in the range (20,000, 42,460) is visible. I have been asked not to plot only the range (20,000, 42,460): I have to plot the range (0, 42,460) and then zoom in on the range (20,000, 42,460).

I have found Axes.relim(), which accepts the argument visible_only=True, but it does not seem to work as I expected.

Another option would be to use a different library to compute the histogram data, derive the Y limit from it, and set it with ax.set_ylim(0, range_max). However, we are also plotting a seaborn KDE that has the same problem, and handling that could be more complicated. Here is an image of a good plot:

Good plot

EDIT:

To reproduce the plot use this data and this code:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

ages = {
    '25-34':'blue',
    '35-44': 'orange',
    '45-54': 'green',
    '55-64': 'red',
}
markers = [plt.Line2D([0,0],[0,0],color=color, marker='o', linestyle='') for color in ages.values()]

data = pd.read_csv("./data.csv")

min = 20_000
max = 42_460

fig = plt.figure(figsize=(10,11))
fig.suptitle("Title", fontsize=12)
fig.legend(markers, ages.keys(),loc='center right')
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5,1,5])
axs = gs.subplots(sharex=True, sharey=False)

sns.histplot(data=data, x="data", bins=200,ax=axs[0])
#axs[1].plot([pos[0] for pos in m1.elevation], [pos[1] for pos in m1.elevation])
sns.kdeplot(data=data, x="data", hue="labels",
    common_norm=False,bw_adjust=.25,ax=axs[2]
    ,legend=False, palette=ages.values(), hue_order=ages.keys())

plt.rcParams['xtick.bottom'] = plt.rcParams['xtick.labelbottom'] = True
plt.rcParams['xtick.top'] = plt.rcParams['xtick.labeltop'] = False

axs[0].set_axisbelow(True)
axs[0].set_xlim(min, max)
# `b` was renamed to `visible` in newer matplotlib; passing True positionally works in both
axs[0].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[0].grid(True, which='major', color='#cccccc20',lw=0.8)
axs[0].relim(visible_only=True)

axs[1].set_ylim(0, 40)
axs[1].set_xticks(np.arange(min, max, 2500))
axs[1].set_xticks(np.arange(min, max, 500), minor=True)
axs[1].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[1].grid(True, which='major', color='#cccccc20',lw=0.8)

axs[2].set_axisbelow(True)
axs[2].set_xlim(min, max)
axs[2].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[2].grid(True, which='major', color='#cccccc20',lw=0.8)

The plot in the middle has been omitted since it was not interesting, and leaving it out makes the code simpler.

Upvotes: 2

Views: 1184

Answers (1)

Alex

Reputation: 7045

The fastest way to achieve what you want is to subset your DataFrame:

# I've renamed these as they override the builtin min/max
min_ = 20_000
max_ = 42_460

# `mask` is a boolean Series (True/False per row) that lets
#   us select a subset of the DataFrame; `between` includes
#   both endpoints by default
mask = data["data"].between(min_, max_)
plot_data = data[mask]
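
To make the mask concrete, here is a toy example. The `toy` frame and its values are made up for illustration and are not your data; the only thing it is meant to show is that both endpoints are kept and everything outside the range is dropped.

```python
import pandas as pd

# Hypothetical stand-in for the "data" column of data.csv.
toy = pd.DataFrame({"data": [5_000, 20_000, 30_000, 42_460, 50_000]})

# One True/False per row: True where 20_000 <= value <= 42_460.
mask = toy["data"].between(20_000, 42_460)

# Boolean indexing keeps only the rows where the mask is True.
subset = toy[mask]
print(subset)
```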

If you pass cut=0 to sns.kdeplot, the curves stop at the extremes of the data, so you shouldn't need to set the xlim for the axes, but this may truncate some lines. I've left it out because I think it looks better without it.

Also, as you use sharex on your subplots, I think you only need to call set_xlim once.
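
A quick way to check this, as a sketch using plain plt.subplots instead of the gridspec in your code (the sharing behaviour should be the same): set the limits on one axes and read them back from all three.

```python
import matplotlib.pyplot as plt

# Three stacked axes that share their x-axis, as in the question.
fig, axs = plt.subplots(3, sharex=True)

# Set the x-limits on the first axes only.
axs[0].set_xlim(20_000, 42_460)

# Read the limits back from every axes; shared axes report the same values.
limits = [ax.get_xlim() for ax in axs]
print(limits)
```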

Then use plot_data to plot your charts:

ages = {"25-34": "blue", "35-44": "orange", "45-54": "green", "55-64": "red"}
markers = [
    plt.Line2D([0, 0], [0, 0], color=color, marker="o", linestyle="")
    for color in ages.values()
]

fig = plt.figure(figsize=(10, 11))
fig.suptitle("Title", fontsize=12)
fig.legend(markers, ages.keys(), loc="center right")
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5, 1, 5])
axs = gs.subplots(sharex=True, sharey=False)

sns.histplot(data=plot_data, x="data", bins=200, ax=axs[0])

sns.kdeplot(
    data=plot_data,
    x="data",
    hue="labels",
    common_norm=False,
    bw_adjust=0.25,
    ax=axs[2],
    legend=False,
    palette=ages.values(),
    hue_order=ages.keys(),
    # cut=0,
)

axs[0].set_axisbelow(True)
axs[0].set_xlim(min_, max_)
# `b` was renamed to `visible` in newer matplotlib; passing True positionally works in both
axs[0].grid(True, which="minor", color="#eeeeee90", lw=0.5)
axs[0].grid(True, which="major", color="#cccccc20", lw=0.8)

# axs[1] omitted

axs[2].set_axisbelow(True)
axs[2].set_xlim(min_, max_)
axs[2].grid(True, which="minor", color="#eeeeee90", lw=0.5)
axs[2].grid(True, which="major", color="#cccccc20", lw=0.8)

Which outputs:

Resulting plot
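
If you have to keep the full data on the axes and only zoom in, bear in mind that subsetting recomputes the 200 bins and the KDE from the subset alone, so the shapes will differ from a zoomed view of the full plot. Axes.relim(visible_only=True) doesn't help with this, because visible_only only skips artists hidden with set_visible(False); it takes no account of the current x-limits.

Below is a minimal sketch of an alternative: plot everything, set the x-limits, then derive the y-limit from the artists that fall inside them. It assumes the bars are Rectangle patches (true for ax.hist, and as far as I know for the default bars of sns.histplot) and the curves are Line2D objects (an unfilled sns.kdeplot). fit_ylim_to_visible_x is a hypothetical helper, not part of matplotlib or seaborn, and the synthetic data only stands in for data.csv.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle


def fit_ylim_to_visible_x(ax, margin=0.05):
    """Set the y-limits of `ax` from the bars and lines inside its x-limits.

    Hypothetical helper: returns the tallest value found in the visible range.
    """
    x_lo, x_hi = sorted(ax.get_xlim())
    tops = [0.0]

    # Histogram bars: keep those that overlap the visible x-range.
    for patch in ax.patches:
        if not isinstance(patch, Rectangle):
            continue  # other patch types are ignored in this sketch
        left = patch.get_x()
        right = left + patch.get_width()
        if right >= x_lo and left <= x_hi:
            tops.append(patch.get_y() + patch.get_height())

    # Curves: keep only the points whose x lies in the visible range.
    for line in ax.lines:
        x = np.asarray(line.get_xdata(), dtype=float)
        y = np.asarray(line.get_ydata(), dtype=float)
        inside = (x >= x_lo) & (x <= x_hi)
        if inside.any():
            tops.append(float(np.nanmax(y[inside])))

    y_max = max(tops)
    if y_max > 0:
        ax.set_ylim(0, y_max * (1 + margin))
    return y_max


# Synthetic stand-in for data.csv: most values lie below 20,000.
rng = np.random.default_rng(0)
values = np.concatenate(
    [rng.normal(8_000, 3_000, 20_000), rng.uniform(20_000, 42_460, 300)]
)

fig, ax = plt.subplots()
ax.hist(values, bins=200)        # bins are computed over the full data
ax.set_xlim(20_000, 42_460)      # zoom first ...
y_max = fit_ylim_to_visible_x(ax)  # ... then fit the y-limit to what is visible
```

In your figure you would call it once per axes after set_xlim, for example fit_ylim_to_visible_x(axs[0]) and fit_ylim_to_visible_x(axs[2]).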

Upvotes: 2
