Reputation: 1365
The pandas DataFrame.plot.kde() function is handy for plotting the estimated density function of a continuous random variable. It takes data x as input and displays the estimated density p(x) as its output.
How can I extract the density values it computes? Instead of just plotting the estimated density, I would like an array or pandas Series that contains the values it computed internally.
If this can't be done with pandas kde, is there an equivalent in scipy or another library?
Upvotes: 14
Views: 12762
Reputation: 2508
There are several ways to do that. You can either compute it yourself or get it from the plot.
1. Get it from the plot that pandas draws. The line's data comes back as an (N, 2) array of x values and density values:
data.plot.kde().get_lines()[0].get_xydata()
2. Use seaborn and then the same as in 1): seaborn estimates the kernel density, and matplotlib is used to extract the values. You can use either distplot or kdeplot:
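import seaborn as sns
# Use one of the two options; if both run on the same axes, get_lines()[0] is the first curve drawn.
# kdeplot returns a matplotlib Axes whose first line holds the KDE curve
x, y = sns.kdeplot(data).get_lines()[0].get_data()
# distplot (deprecated since seaborn 0.11; prefer kdeplot on newer versions)
x, y = sns.distplot(data, hist=False).get_lines()[0].get_data()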
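3. Compute it yourself with scipy.stats.gaussian_kde, the underlying method that pandas uses:
import numpy as np
import scipy.stats
density = scipy.stats.gaussian_kde(data)
and then you can use this to evaluate it on a set of points:
x = np.linspace(0, 80, 200)
y = density(x)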
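Since the question asks for an array or pandas Series, here is a minimal self-contained sketch of the scipy route. The sample data, the padded evaluation grid, and the names span, density_series and area are assumptions made for illustration, not something pandas exposes.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Hypothetical sample data standing in for the asker's `data`.
rng = np.random.default_rng(0)
data = pd.Series(rng.normal(loc=40, scale=10, size=500))

# Fit the estimator; scipy's default bandwidth rule is Scott's rule.
density = gaussian_kde(data.to_numpy())

# Evaluation grid (an assumption for illustration): padded by half the
# sample range on each side so the tails of the curve are covered.
span = data.max() - data.min()
x = np.linspace(data.min() - 0.5 * span, data.max() + 0.5 * span, 200)
y = density(x)

# Density values indexed by the points they were evaluated at.
density_series = pd.Series(y, index=x, name="density")

# Sanity check: the trapezoidal area under the curve should be close to 1.
area = float(((y[:-1] + y[1:]) / 2 * np.diff(x)).sum())
print(density_series.head())
print(f"area under curve: {area:.3f}")
```

Note that these are density values, not probabilities, so individual values can exceed 1 for tightly concentrated data. If you need the probability of an interval, density.integrate_box_1d(low, high) should give it.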
Upvotes: 19