Reputation: 11922
I'm simulating a simple queue using SimPy. One of the questions about the system is how a visitor's waiting time is distributed. To answer it, I draw a normalized histogram of the waiting times sampled during the simulation.
This distribution is not purely continuous: there is a non-zero probability that the waiting time is exactly zero, hence the peak near the left end. I want the picture to make it obvious what the actual probability of hitting 0 exactly is. Right now the height of the peak does not show that properly; it is even higher than one (many points fall into a narrow bin near zero, and a normalized histogram shows density, not probability).
So my question is: what is a general technique for visualizing distributions that are mixtures of a continuous and a discrete one?
Upvotes: 1
Views: 1106
Reputation: 26080
(based on the discussion in the comments to OP).
For a distribution of some variable, call it $t$, that is a mixture of discrete and continuous components, I'd write the pdf as a sum of a set of delta peaks and a continuous part,

$$p(t) = \sum_{a} p_a \delta(t-t_a) + f(t)$$

where $a$ enumerates the discrete values $t_a$, $p_a$ are the probabilities of $t_a$, and $f(t)$ is the density of the continuous part of the distribution, so that $f(t)\,dt$ is the probability for $t$ to belong to $[t, t+dt)$.
Notice that the whole thing is normalized, $\int p(t)\,dt = 1$, where the integral is over the appropriate range of $t$.
Now, for visualizing this, I'd separate out the discrete components and plot them as discrete values (either as narrow bins or as points with droplines, etc.). For the rest, I'd use a histogram, with the correct normalization following from the equation above: the area under the histogram should sum to $1-\sum_a p_a$.
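A rough sketch of how this could look in Python, for the single discrete value $t_a = 0$ from the question. The helper names `mixed_histogram` and `plot_mixed` are made up for illustration, the data are synthetic rather than real SimPy output, and putting the probability on a second y-axis is just one possible choice (matplotlib is assumed to be installed for the drawing part).

```python
import random


def mixed_histogram(samples, atom=0.0, bins=20):
    """Split a sample into a point mass at `atom` and a histogram of the rest.

    Returns (p_atom, edges, heights). The heights are scaled so that the total
    area under the histogram equals 1 - p_atom.
    Assumption: values at the atom are stored as exactly `atom` (no rounding noise).
    """
    n = len(samples)
    rest = [t for t in samples if t != atom]
    p_atom = (n - len(rest)) / n
    if not rest:
        return p_atom, [], []
    lo, hi = min(rest), max(rest)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values coincide
    counts = [0] * bins
    for t in rest:
        k = min(int((t - lo) / width), bins - 1)  # put the maximum in the last bin
        counts[k] += 1
    edges = [lo + k * width for k in range(bins + 1)]
    # Divide by the FULL sample size n, not len(rest): area becomes len(rest) / n.
    heights = [c / (n * width) for c in counts]
    return p_atom, edges, heights


def plot_mixed(p_atom, edges, heights, atom=0.0):
    """Draw the continuous part as bars and the atom as a point with a dropline."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    fig, ax = plt.subplots()
    ax.bar(edges[:-1], heights, width=edges[1] - edges[0], align="edge", alpha=0.6)
    ax.set_xlabel("waiting time t")
    ax.set_ylabel("density of continuous part")

    # Probability and density have different units, so the atom gets its own axis.
    ax_p = ax.twinx()
    ax_p.vlines(atom, 0.0, p_atom, colors="C3")
    ax_p.plot([atom], [p_atom], "o", color="C3")
    ax_p.set_ylim(0.0, 1.0)
    ax_p.set_ylabel("probability of discrete value")
    return fig


# Synthetic stand-in for simulated waiting times (NOT real SimPy output):
# zero wait with probability 0.3, otherwise exponential with mean 2.
random.seed(0)
waits = [0.0 if random.random() < 0.3 else random.expovariate(0.5)
         for _ in range(10_000)]

p0, edges, heights = mixed_histogram(waits)
area = sum(h * (edges[1] - edges[0]) for h in heights)
print(f"P(t = 0) ~ {p0:.3f}, histogram area ~ {area:.3f}")
# plot_mixed(p0, edges, heights).savefig("waits.png")  # uncomment to draw
```

The printed probability and histogram area should add up to 1, which is the normalization condition above. With several discrete values you would repeat the dropline for each $t_a$.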
I'm not claiming this is the way to do it; it's just what I'd do.
Upvotes: 1