Reputation: 831
I'm trying to model a process that has arrival times. I have sampled actual arrivals and have a series of arrival counts per day for many days. I want to use this measured data to make a series of actual arrival timestamps that follow a Poisson distribution.
For example, given: countPerDay = [2,3,1,...] compute: arrivalTimes = [0.324, 0.547, 1.223, 1.563, 1.844, 2.618, ...]
Observe that 2 arrive on the first day, 3 on the second day, 1 on the third day, etc.
I currently do this using a uniform distribution as follows:
import random

arrivalTimes = []
for d, j in enumerate(countPerDay):
    # pick j distinct ticks within day d, uniformly at random
    l = random.sample(range(ticksPerDay), j)
    arrivalTimes += [(d*ticksPerDay + v) for v in l]
How do I change this such that the arrival times are on a Poisson distribution rather than uniform? I know that the Exponential distribution is meant to provide Poisson inter-arrival times, but in this case where I need an exact number of arrivals per day I worry that it will bias all of the arrivals toward the beginning of each day.
And intuitively, what is different / better about Poisson arrival times than uniform?
Upvotes: 1
Views: 1534
Reputation: 306
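Your code is fine: conditioned on the number of arrivals in an interval, the arrival times of a Poisson process are independently and uniformly distributed over that interval. Conversely, for a large set of uniformly distributed arrival times, the gaps between consecutive arrivals are approximately exponentially distributed. To test, I used the following code: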
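import numpy as np
import matplotlib.pyplot as plt

n = int(1e8)  # Many points
event_times = n * np.random.rand(n)  # uniform times on [0, n), about one event per unit time
event_times.sort()
event_distances = event_times[1:] - event_times[:-1]  # gaps between consecutive events
plt.hist(event_distances, bins=100)
plt.xlim(0, 8)  # To show the part with high n
event_distances.mean()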
This returns a mean of 0.99999996740170383 and a histogram of gaps with the decaying shape of an exponential distribution.
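The worry about arrivals piling up at the start of each day can be checked from the other direction as well. Here is a rough sketch, not a definitive test: it builds a Poisson process from exponential gaps, keeps only the days that happen to contain exactly k arrivals, and looks at where those arrivals fall within the day. The rate of 3 per day, the horizon, k = 3, and the seed are all arbitrary values I picked for illustration; none of them come from the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

rate = 3.0         # assumed mean arrivals per day (hypothetical value)
n_days = 200_000   # simulated horizon in days (hypothetical value)

# Poisson process: arrival times are the cumulative sum of exponential gaps.
n_draws = int(rate * n_days * 1.1)  # draw extra gaps so the horizon is covered
times = np.cumsum(rng.exponential(1.0 / rate, size=n_draws))
times = times[times < n_days]

day = times.astype(np.int64)                 # day index of each arrival
frac = times - day                           # position within the day, in [0, 1)
counts = np.bincount(day, minlength=n_days)  # arrivals per day

k = 3                        # condition on days with exactly k arrivals
frac_k = frac[counts[day] == k]

# times is sorted, so each selected day contributes k consecutive, ordered values
ordered = frac_k.reshape(-1, k)

mean_position = frac_k.mean()            # uniform positions would average 0.5
mean_order_stats = ordered.mean(axis=0)  # uniform order statistics average i / (k + 1)
print(mean_position, mean_order_stats)
```

If the overall mean comes out near 0.5 and the ordered means near 1/4, 2/4 and 3/4, the arrivals on those days are spread like uniform draws with no bias toward the start of the day, which is why sampling j uniform positions per day already reproduces Poisson-process arrival times for a fixed daily count.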
Upvotes: 1