Reputation: 1439
I have the following data:
name val
G.Kittle 4.0
G.Kittle 10.0
D.Hopkins 3.0
L.Fitzgerald 6.0
... ...
C.Kupp 18.0
R.Woods 21.0
N.Harry 7.0
S.Michel -6.0
Each name has many values, and I would like to plot a distribution for each name on the same figure. I tried doing this with the hue argument, but that normalized all the distributions together, so that their combined area is 1. I want each distribution to be independent of the others and have its own area of 1. Does that make sense? I would also like all of them to be gray, which hue doesn't do by default.
Edit: Also, when I use hue, I get this warning: UserWarning: Dataset has 0 variance; skipping density estimate.
Upvotes: 0
Views: 3498
Reputation: 80289
sns.kdeplot() has a parameter common_norm=, which defaults to True. In that case, each kde curve is scaled in proportion to its number of values, so that the areas of all curves together sum to 1. Setting common_norm=False scales every kde curve individually, so that each one has an area of 1.
Note that there is also a multiple= parameter, which defaults to "layer" but can also be set to "stack" or "fill". In those cases, the common norm would be appropriate.
The curves can all be colored grey by providing a palette as a list of colors, each of them 'grey'. The length of the list should equal the number of hue values. As all curves then have the same color, a legend would look strange; it can be suppressed with legend=False.
When a hue value appears in only one row, the kde curve for that single element isn't drawn, and seaborn shows the warning Dataset has 0 variance; skipping density estimate.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# random test data: four names with different numbers of values
df = pd.DataFrame({'name': np.random.choice([*'ABCD'], 100, p=[0.4, 0.3, 0.2, 0.1]),
                   'val': np.random.rand(100).cumsum()})
df.loc[0, 'name'] = 'E'  # exactly one row with name 'E'
df['name'] = df['name'].astype('category')
# one grey curve per name, each normalized to an area of 1
sns.kdeplot(data=df, x='val', hue='name', palette=['grey'] * len(df['name'].cat.categories),
            common_norm=False, legend=False)
plt.show()
Upvotes: 1