vorpal

Reputation: 318

Lower bound for kernel density estimation with seaborn/matplotlib in Python

I have a collection of measured tree diameters and am trying to plot a histogram with a kernel density estimate (KDE) superimposed on top in Python. The seaborn module lets me do this quite simply, but I can find no way of specifying that the KDE should be zero for negative numbers (since trees can't have negative diameters).

What I've got at present is this:

seaborn.distplot(C77_diam, rug=True, hist=True, kde=True)

I've looked at seaborn.kdeplot, which is the function that distplot calls, but can't find anything useful. Does anyone know whether this can be done with seaborn and, if not, whether it can be done with matplotlib more generally?

I only started using seaborn because I couldn't figure out how to overlay a KDE drawn with pyplot.plot() on a pyplot.hist().

Upvotes: 5

Views: 10639

Answers (3)

MrIzzat

Reputation: 132

If there is an outlier, using .set(xlim=(0, max_diam)) may make the distribution line end in pointy edges (image not included).

I found a different answer that uses kde=True, kde_kws=dict(clip=(bins.min(), bins.max())) to limit the KDE curve to the range covered by the bins. It generates a smoother distribution line (image not included).

Example usage: sns.histplot(df, x='duration_sec', bins=bins, kde=True, kde_kws=dict(clip=(bins.min(), bins.max())))

Upvotes: 1

mossquatch

Reputation: 21

This is an old thread, but it came up first in my Google search for a similar question. In case anyone else lands here like I did: in the years since this question was answered, seaborn has added the cut and clip parameters. Setting cut to 0 truncates the curve at the data limits, so for strictly positive diameters it never extends below zero. With distplot, the parameter has to be passed through kde_kws:

seaborn.distplot(C77_diam, rug=True, hist=True, kde=True, kde_kws={"cut": 0})

Upvotes: 2

mwaskom

Reputation: 48992

There's no way to force the density estimate to zero with that function, but you can always set the axis limits such that the left side of the plot starts at 0.

seaborn.distplot(C77_diam, rug=True, hist=True, kde=True).set(xlim=(0, max_diam))
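For the matplotlib-only overlay the question asks about, one possible sketch uses scipy.stats.gaussian_kde. That is a swapped-in estimator, not necessarily what seaborn uses internally. It is evaluated only on a grid that starts at 0, and diam is synthetic stand-in data for C77_diam (an assumption). As with the axis-limit approach above, this hides the negative part of the curve rather than forcing the estimate to zero there.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Synthetic stand-in for the measured diameters (assumption).
rng = np.random.default_rng(2)
diam = rng.gamma(2.0, 5.0, size=200)
max_diam = diam.max()

# Fit a Gaussian KDE and evaluate it only on [0, max_diam].
kde = gaussian_kde(diam)
grid = np.linspace(0, max_diam, 200)
density = kde(grid)

fig, ax = plt.subplots()
# density=True puts the histogram on the same scale as the KDE.
ax.hist(diam, bins=20, density=True, alpha=0.5)
ax.plot(grid, density)
# Simple rug: one tick mark per observation along the x-axis.
ax.plot(diam, np.zeros_like(diam), "|", color="k")
ax.set_xlim(0, max_diam)
```

Call plt.show() or fig.savefig(...) afterwards to view the result.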

Upvotes: 16
