Reputation: 357
I have been asked to plot some histograms and KDEs using seaborn. We only want to focus on a range of the X axis, so I use ax.set_xlim(20_000, 42_460). In some cases most of the data lies below 20,000, so the plot looks like this:
The full plot looks like this:
There is data in that range, but since most of it lies in the range (0, 20,000), matplotlib fits the Y limit to those values, and the data in the range (20,000, 42,460) is too small to see.
I would like to know a way to automatically adjust the Y limit so that the data in the range (20,000, 42,460) is visible. I have been asked not to plot only the range (20,000, 42,460): I have to plot the range (0, 42,460) and then zoom in on the range (20,000, 42,460).
I have found Axes.relim(), which can take an argument visible_only=True, but it does not seem to work as I expected.
Another option would be to use a different library to compute the histogram data, derive the Y limit from it, and set it with ax.set_ylim(0, range_max), but we are also plotting a seaborn KDE that has the same problem, and that could be more complicated. Here is an image of a good plot:
EDIT:
To reproduce the plot use this data and this code:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
ages = {
    '25-34': 'blue',
    '35-44': 'orange',
    '45-54': 'green',
    '55-64': 'red',
}
markers = [plt.Line2D([0,0],[0,0],color=color, marker='o', linestyle='') for color in ages.values()]
data = pd.read_csv("./data.csv")
min = 20_000
max = 42_460
fig = plt.figure(figsize=(10,11))
fig.suptitle("Title", fontsize=12)
fig.legend(markers, ages.keys(),loc='center right')
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5,1,5])
axs = gs.subplots(sharex=True, sharey=False)
sns.histplot(data=data, x="data", bins=200,ax=axs[0])
#axs[1].plot([pos[0] for pos in m1.elevation], [pos[1] for pos in m1.elevation])
sns.kdeplot(data=data, x="data", hue="labels",
            common_norm=False, bw_adjust=.25, ax=axs[2],
            legend=False, palette=ages.values(), hue_order=ages.keys())
plt.rcParams['xtick.bottom'] = plt.rcParams['xtick.labelbottom'] = True
plt.rcParams['xtick.top'] = plt.rcParams['xtick.labeltop'] = False
axs[0].set_axisbelow(True)
axs[0].set_xlim(min, max)
axs[0].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[0].grid(True, which='major', color='#cccccc20',lw=0.8)
axs[0].relim(visible_only=True)
axs[1].set_ylim(0, 40)
axs[1].set_xticks(np.arange(min, max, 2500))
axs[1].set_xticks(np.arange(min, max, 500), minor=True)
axs[1].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[1].grid(True, which='major', color='#cccccc20',lw=0.8)
axs[2].set_axisbelow(True)
axs[2].set_xlim(min, max)
axs[2].grid(True, which='minor', color='#eeeeee90',lw=0.5)
axs[2].grid(True, which='major', color='#cccccc20',lw=0.8)
The middle plot has been omitted since it was not interesting and leaving it out keeps the code simpler.
Upvotes: 2
Views: 1184
Reputation: 7045
The fastest way to achieve what you want is to subset your DataFrame:
# I've renamed these as they override the builtin min/max
min_ = 20_000
max_ = 42_460
# `mask` is an array of True/False that allows
# us to select a subset of the DataFrame
# (both endpoints are included by default)
mask = data["data"].between(min_, max_)
plot_data = data[mask]
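As a quick sanity check of what the mask does, here is a minimal sketch on a made-up DataFrame. The values in toy are hypothetical and only stand in for the "data" column of data.csv; the point is that Series.between keeps both endpoints by default.

```python
import pandas as pd

# Hypothetical values standing in for the "data" column of data.csv
toy = pd.DataFrame({"data": [5_000, 20_000, 30_000, 42_460, 50_000]})

min_ = 20_000
max_ = 42_460

# between() is inclusive on both ends by default
mask = toy["data"].between(min_, max_)
subset = toy[mask]

print(mask.tolist())            # [False, True, True, True, False]
print(subset["data"].tolist())  # [20000, 30000, 42460]
```

Rows outside the range are dropped before plotting, so the bins and the KDE are computed from the subset only.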
If you set cut=0 on the sns.kdeplot call you shouldn't need to set the xlim for the axes, but this may truncate some lines. I've left it out because I think it looks better without it.
Also, as you use sharex on your subplots, I think you only need to call set_xlim once.
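A small sketch to check the sharex behaviour, assuming the same gridspec layout as in the question. It uses the non-interactive Agg backend only so that it runs without a display; nothing is plotted.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption made for this sketch
import matplotlib.pyplot as plt

fig = plt.figure()
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5, 1, 5])
axs = gs.subplots(sharex=True, sharey=False)

# Set the X limits on the first axes only
axs[0].set_xlim(20_000, 42_460)

# The other axes share the X axis, so they report the same limits
for ax in axs:
    print(ax.get_xlim())
```

If all three lines print the same limits, the extra set_xlim calls on the other axes are redundant, although harmless.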
Then use plot_data to plot your charts:
ages = {"25-34": "blue", "35-44": "orange", "45-54": "green", "55-64": "red"}
markers = [
    plt.Line2D([0, 0], [0, 0], color=color, marker="o", linestyle="")
    for color in ages.values()
]
fig = plt.figure(figsize=(10, 11))
fig.suptitle("Title", fontsize=12)
fig.legend(markers, ages.keys(), loc="center right")
gs = fig.add_gridspec(3, hspace=0, height_ratios=[5, 1, 5])
axs = gs.subplots(sharex=True, sharey=False)
sns.histplot(data=plot_data, x="data", bins=200, ax=axs[0])
sns.kdeplot(
    data=plot_data,
    x="data",
    hue="labels",
    common_norm=False,
    bw_adjust=0.25,
    ax=axs[2],
    legend=False,
    palette=ages.values(),
    hue_order=ages.keys(),
    # cut=0,
)
axs[0].set_axisbelow(True)
axs[0].set_xlim(min_, max_)
axs[0].grid(True, which="minor", color="#eeeeee90", lw=0.5)
axs[0].grid(True, which="major", color="#cccccc20", lw=0.8)
# axs[1] omitted
axs[2].set_axisbelow(True)
axs[2].set_xlim(min_, max_)
axs[2].grid(True, which="minor", color="#eeeeee90", lw=0.5)
axs[2].grid(True, which="major", color="#cccccc20", lw=0.8)
Which outputs:
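If the full data set has to stay in the figure, as the question requires, a different approach is to leave the data alone and rescale Y from what is already drawn. The sketch below is an illustration under assumptions, not a seaborn or matplotlib feature: autoscale_y_to_xlim is a hypothetical helper, it assumes the histogram bars are Rectangle patches and the KDE curves are Line2D objects (which I believe is what the default, unfilled plots produce), and it uses ax.hist on synthetic data in place of data.csv.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, an assumption made for this sketch
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Rectangle


def autoscale_y_to_xlim(ax, margin=0.05):
    """Hypothetical helper: fit the Y limit to lines and bars inside the X limits."""
    lo, hi = ax.get_xlim()
    top = 0.0
    # Curves (assumed to be how an unfilled KDE is drawn)
    for line in ax.get_lines():
        x = np.asarray(line.get_xdata(), dtype=float)
        y = np.asarray(line.get_ydata(), dtype=float)
        inside = (x >= lo) & (x <= hi)
        if inside.any():
            top = max(top, float(np.nanmax(y[inside])))
    # Bars (assumed to be how a default histogram is drawn)
    for patch in ax.patches:
        if not isinstance(patch, Rectangle):
            continue
        left = patch.get_x()
        right = left + patch.get_width()
        if right >= lo and left <= hi:  # bar overlaps the visible range
            top = max(top, float(patch.get_height()))
    if top > 0:
        ax.set_ylim(0, top * (1 + margin))
    return top


# Synthetic data: many small values, few large ones
rng = np.random.default_rng(0)
values = np.concatenate([
    rng.uniform(0, 15_000, 10_000),
    rng.uniform(25_000, 42_460, 200),
])

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(values, bins=200)  # histogram of the full range
ax.set_xlim(20_000, 42_460)                   # zoom in afterwards
top = autoscale_y_to_xlim(ax)
print(top, counts.max())
```

In the figure from the question this would be called as autoscale_y_to_xlim(axs[0]) and autoscale_y_to_xlim(axs[2]) after set_xlim, so the bins and densities are still computed from the full range (0, 42,460).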
Upvotes: 2