Reputation: 91
Is there a way to plot the percentage instead of the count on a distplot?
ax = sns.FacetGrid(telcom, hue='Churn', palette=["teal", "crimson"], size=5, aspect=1)
ax = ax.map(sns.distplot, "tenure", hist=True, kde=False)
ax.fig.suptitle('Tenure distribution in customer churn', y=1, fontsize=16, fontweight='bold');
plt.legend();
Upvotes: 7
Views: 27008
Reputation: 62403
seaborn 0.11.2
seaborn.distplot is replaced with the Figure level seaborn.displot and Axes level seaborn.histplot, which have a stat parameter. Use stat='percent'.
See also the common_bins and common_norm parameters. common_norm=True will show the percent as a part of the entire population, whereas False will show the percent relative to the group.
import seaborn as sns
import matplotlib.pyplot as plt
# data
data = sns.load_dataset('titanic')
p = sns.displot(data=data, x='age', stat='percent', hue='sex', height=3)
plt.show()
p = sns.displot(data=data, x='age', stat='percent', col='sex', height=3)
plt.show()
The assignment expression (:=) used in the labels requires python >= 3.8. This can be implemented with a for-loop, without using :=.
fg = sns.displot(data=data, x='age', stat='percent', col='sex', height=3.5, aspect=1.25)
for ax in fg.axes.ravel():
    # add annotations
    for c in ax.containers:
        # custom label calculates percent and adds an empty string so 0 value bars don't have a number
        labels = [f'{w:0.1f}%' if (w := v.get_height()) > 0 else '' for v in c]
        ax.bar_label(c, labels=labels, label_type='edge', fontsize=8, rotation=90, padding=2)
    ax.margins(y=0.2)
plt.show()
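The for-loop alternative mentioned above is not spelled out, so here is a minimal sketch of one way it might look. The helper name percent_labels is hypothetical (it is not part of seaborn or matplotlib); it only builds the label strings from a list of bar heights.

```python
def percent_labels(heights):
    """Build bar labels with a plain for-loop, so no := is needed.

    Hypothetical helper for illustration; not a seaborn/matplotlib function.
    """
    labels = []
    for h in heights:
        # empty string for zero-height bars so they get no annotation
        labels.append(f'{h:0.1f}%' if h > 0 else '')
    return labels


print(percent_labels([12.34, 0, 3.0]))
```

Inside the annotation loop, the list comprehension could then presumably be replaced with labels = percent_labels([v.get_height() for v in c]).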
fig = plt.figure(figsize=(4, 3))
p = sns.histplot(data=data, x='age', stat='percent', hue='sex')
plt.show()
Use the common_norm= parameter:
p = sns.displot(data=data, x='age', stat='percent', hue='sex', height=4, common_norm=False)
p = sns.displot(data=data, x='age', stat='percent', col='sex', height=4, common_norm=False)
fig = plt.figure(figsize=(5, 4))
p = sns.histplot(data=data, x='age', stat='percent', hue='sex', common_norm=False)
plt.show()
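To make the difference between common_norm=True and common_norm=False concrete, here is a rough sketch of the arithmetic in plain Python. The data, the bin edges, and the function names are made up for illustration, and this is not seaborn's implementation: it only changes the denominator, which is the behaviour described above.

```python
# Made-up toy data: ages for two groups
groups = {
    'male': [5, 15, 15, 25, 25, 25],
    'female': [5, 15],
}
edges = [0, 10, 20, 30]  # bin edges shared by both groups


def bin_counts(values, edges):
    """Count values per half-open bin [lo, hi)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def percent_by_bin(groups, edges, common_norm=True):
    """Percent per bin per group; only the denominator depends on common_norm."""
    total = sum(len(v) for v in groups.values())
    result = {}
    for name, values in groups.items():
        # whole population if common_norm, else just this group
        denom = total if common_norm else len(values)
        result[name] = [100 * c / denom for c in bin_counts(values, edges)]
    return result


shared = percent_by_bin(groups, edges, common_norm=True)
per_group = percent_by_bin(groups, edges, common_norm=False)
print(shared)     # all bars together sum to 100
print(per_group)  # each group's bars sum to 100
```

With the shared denominator the smaller group's bars look short because they are fractions of everyone; with per-group denominators the two shapes become directly comparable.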
Upvotes: 16
Reputation: 3961
You could choose a bar plot and set an estimator that normalizes the counts to percentages:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(dict(x=np.random.poisson(10, 1_000)))
ax = sns.barplot(x="x",
y="x",
data=df,
palette=["teal", "crimson"],
estimator=lambda x: len(x) / len(df) * 100
)
ax.set(xlabel="tenure")
ax.set(ylabel="Percent")
plt.show()
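The estimator above divides the number of rows for each distinct x by the total number of rows. As a sanity check, the same numbers can be sketched without any plotting; the sample below is made up and simply stands in for df["x"].

```python
from collections import Counter

# made-up sample standing in for df["x"]
values = [9, 10, 10, 11, 10, 12, 9, 10]

counts = Counter(values)
# percent of all observations that fall on each distinct value
percent = {k: 100 * c / len(values) for k, c in sorted(counts.items())}
print(percent)
```

The bar heights from the estimator should match these percentages, and they sum to 100.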
Upvotes: 0
Reputation: 308
You could use norm_hist=True.
From the documentation:
norm_hist : bool, optional
If True, the histogram height shows a density rather than a count. This is implied if a KDE or fitted density is plotted.
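Note that the quoted documentation says density, which is not the same thing as percent: with density the bar areas sum to 1, so the heights depend on the bin width. A minimal sketch of the conversion in plain Python, using made-up values and bin edges:

```python
values = [1, 2, 2, 3, 3, 3, 7, 8]  # made-up sample
edges = [0, 2.5, 5, 10]            # unequal bin widths on purpose

n = len(values)
bins = list(zip(edges[:-1], edges[1:]))
widths = [hi - lo for lo, hi in bins]
# count values per half-open bin [lo, hi)
counts = [sum(lo <= v < hi for v in values) for lo, hi in bins]

# density: count / (n * width), so the bar areas sum to 1
density = [c / (n * w) for c, w in zip(counts, widths)]
# percent: density * width * 100, so the bar heights sum to 100
percent = [d * w * 100 for d, w in zip(density, widths)]

print(density)
print(percent)
```

So a density histogram would still need rescaling by the bin width (times 100) before the y axis reads as percent.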
Upvotes: 1