Reputation: 101
I am trying to get only non-negative values on the x-axis of my KDE plot. I know I can limit the x-axis range, but I do not want that. Is there a way to smoothly approximate the KDE such that there are no negative values? All my data are non-negative, but I do not have many sample points (500 at most, and I cannot get more). I have also tried adjusting the bandwidth, but the result does not look nice.
for i in range(len(B)):
    ax = sns.kdeplot(data[i],shade=True)
ax.set_xlabel('Maximum detection time')
ax.legend(['N=25,R=20', 'N=30,R=20', 'N=35,R=20'],fontsize=5)
plt.show()
Upvotes: 3
Views: 3026
Reputation: 46958
What goes on behind kdeplot is that a kernel density estimate is built by summing many little normal densities, one centred on each data point (see this illustration). The densities for points close to the truncation cutoff spill over it, which is why the curve extends below zero.
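To make the spill-over concrete, here is a minimal sketch that computes how much of a Gaussian KDE's probability mass lands below zero. The function name, the sample, and the bandwidth `h = 0.1` are illustrative assumptions, not values taken from kdeplot.

```python
import math
import numpy as np

def gaussian_kde_mass_below_zero(x, h):
    """Fraction of a Gaussian KDE's mass below 0 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # A normal kernel centred at x_i with scale h puts Phi(-x_i / h)
    # of its mass below zero; the KDE is the average of the kernels.
    phi = [0.5 * (1.0 + math.erf(-xi / (h * math.sqrt(2.0)))) for xi in x]
    return float(np.mean(phi))

rng = np.random.default_rng(0)
sample = rng.exponential(0.3, 100)   # non-negative data, as in the question
h = 0.1                              # assumed bandwidth, for illustration only
leak = gaussian_kde_mass_below_zero(sample, h)
print(f"share of the density below zero: {leak:.3f}")
```

Points near zero contribute the most: a point sitting exactly at 0 loses half of its kernel to the negative side.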
Using some example data:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm
np.random.seed(999)
data = pd.DataFrame({'a':np.random.exponential(0.3,100),
                     'b':np.random.exponential(0.5,100)})
If you use clip=, it doesn't stop the evaluation at negative values:
for i in data.columns:
    ax = sns.kdeplot(data[i],shade=True,gridsize=200)
If you add cut=0, it will look odd. As you pointed out, you can truncate it at 0:
There are two solutions proposed in this post on Cross Validated. I wrote a Python implementation of the R code provided by @whuber:
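def trunc_dens(x):
    # Fit an ordinary KDE first, only to obtain the default bandwidth
    kde = sm.nonparametric.KDEUnivariate(x)
    kde.fit()
    h = kde.bw
    # Weight each point by 1 / P(its normal kernel falls above 0)
    w = 1/(1-norm.cdf(0,loc=x,scale=h))
    # Refit with the same bandwidth and the weights (weights require fft=False)
    d = sm.nonparametric.KDEUnivariate(x)
    d = d.fit(bw=h,weights=w / len(x),fft=False)
    d_support = d.support
    d_dens = d.density
    # Zero out the density on the negative part of the support
    d_dens[d_support<0] = 0
    return d_support,d_dens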
We can check how it looks for data['a']:
kde = sm.nonparametric.KDEUnivariate(data['a'])
kde.fit()
plt.plot(kde.support,kde.density)
_x,_y = trunc_dens(data['a'])
plt.plot(_x,_y)
You can plot it for both:
fig,ax = plt.subplots()
for i in data.columns:
    _x,_y = trunc_dens(data[i])
    ax.plot(_x,_y)
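As a sanity check on the reweighting idea, here is a NumPy-only sketch of the same truncation written out by hand, so that the area under the curve can be verified numerically. It is not the function above: `truncated_kde`, the fixed bandwidth `h = 0.1`, and the grid are assumptions made for illustration, and the weights are applied as-is without any further normalisation.

```python
import math
import numpy as np

def truncated_kde(x, h, grid):
    """Gaussian KDE reweighted to live on [0, inf) (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Mass each kernel keeps on the non-negative side: 1 - Phi(-x_i / h)
    kept = np.array([1.0 - 0.5 * (1.0 + math.erf(-xi / (h * math.sqrt(2.0))))
                     for xi in x])
    w = 1.0 / kept                      # rescale each kernel to integrate to 1 on [0, inf)
    z = (grid[:, None] - x[None, :]) / h
    kernels = np.exp(-0.5 * z ** 2) / (h * math.sqrt(2.0 * math.pi))
    dens = kernels @ w / len(x)         # weighted average of the kernels
    dens[grid < 0] = 0.0                # nothing is allowed below zero
    return dens

rng = np.random.default_rng(0)
sample = rng.exponential(0.3, 100)
h = 0.1                                  # assumed bandwidth
grid = np.linspace(0.0, sample.max() + 6 * h, 2001)
dens = truncated_kde(sample, h, grid)
# Trapezoidal rule, written out to avoid depending on a specific NumPy version
area = float(np.sum((dens[1:] + dens[:-1]) * np.diff(grid)) / 2.0)
print(f"area under the truncated density: {area:.4f}")
```

If the area comes out close to 1, the curve is a proper density on the non-negative axis rather than an ordinary KDE with its left tail cut off.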
Upvotes: 4