Reputation: 2816
Let's say we have a service that receives requests, and we count those requests on an hourly basis, e.g. from 12-1, 1-2, etc. What I want to do is generate request counts that follow Poisson arrivals and then add them to a dictionary representing a day of the week:
monday = [hour_range, number_of_clients_in_that_hour]
At the end, we will have 7 dictionaries, named from Monday to Sunday, on which some linear regression can be used to predict the number of clients for the next hour of a given day.
So basically, as I am simulating this scenario in Python, I need an arrival process that represents this kind of scenario. I have the following code, which generates the number of clients in an hour using a uniform distribution. How can I do it for Poisson arrival or any other arrival which truly represents such scenario? My code is as follows:
    import random

    import numpy as np
    import matplotlib.pyplot as plt

    day_names = ['mon', 'tue', 'wed', 'thurs', 'fri', 'sat', 'sun']
    time_values = np.linspace(1, 23, 23, dtype='int')  # hours 1, 2, ..., 23

    for day_iterator in range(1, 7 + 1):
        number_of_clients = []  # number of clients for each hour of this day
        for i in range(1, 24, 1):  # generate the number of clients hour by hour
            rand_value = random.randint(1, 20)  # uniform number of clients
            number_of_clients.append(rand_value)
        # a single day's data has been generated at this point;
        # create a dict for each day of the week (locals() assignment only works at module level)
        locals()[day_names[day_iterator - 1]] = dict(zip(time_values, number_of_clients))

    # print each day
    print("monday = %s" % mon)
    print("tuesday = %s" % tue)
    print("wed = %s" % wed)
    print("thurs = %s" % thurs)
    print("fri = %s" % fri)
    print("sat = %s" % sat)
    print("sun = %s" % sun)
    plt.plot(list(mon.keys()), list(mon.values()))
Upvotes: 0
Views: 314
Reputation: 19855
The path of least resistance is to use the built-in Poisson generator from numpy. However, if you want to roll your own, the following code will do the trick:
    import math
    import random

    def poisson(rate):
        x = 0
        product = random.random()
        threshold = math.exp(-rate)
        while product >= threshold:
            product *= random.random()
            x += 1
        return x
This is based on the fact that Poisson events have exponentially distributed interarrival times, so you can generate unit-rate exponentials until their sum exceeds your specified rate. This implementation is slightly more clever, though. A unit-rate exponential can be generated as the negative logarithm of a uniform random number, so by exponentiating both sides of the summation/threshold relationship, the sum of logarithmic evaluations turns into simple multiplication, and the result can be compared to a pre-calculated exponentiated threshold. This is algebraically identical to the sum of exponential random variates, but it performs a single exponentiation and an average of lambda multiplications, rather than summing an average of lambda log evaluations.
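To make the equivalence concrete, here is a minimal sketch of the two formulations side by side. The function names `poisson_by_sum` and `poisson_by_product` and the explicit `rng` argument are illustrative, not part of the code above, and `1.0 - rng.random()` is used instead of `rng.random()` only so that the logarithm never sees an exact 0.

```python
import math
import random


def poisson_by_sum(rate, rng):
    """Add unit-rate exponentials, -log(U), until the running sum exceeds rate."""
    x = 0
    total = -math.log(1.0 - rng.random())  # 1 - U lies in (0, 1], so log is defined
    while total <= rate:
        total += -math.log(1.0 - rng.random())
        x += 1
    return x


def poisson_by_product(rate, rng):
    """Same test after exponentiating both sides: multiply uniforms instead."""
    x = 0
    product = 1.0 - rng.random()
    threshold = math.exp(-rate)
    while product >= threshold:
        product *= 1.0 - rng.random()
        x += 1
    return x


# Two generators with the same seed feed both versions the same uniforms.
rng_a, rng_b = random.Random(12345), random.Random(12345)
by_sum = [poisson_by_sum(3.0, rng_a) for _ in range(2000)]
by_product = [poisson_by_product(3.0, rng_b) for _ in range(2000)]
sample_mean = sum(by_product) / len(by_product)  # should be near the rate, 3.0
```

Barring floating-point rounding right at the threshold, the two lists should match draw for draw, and the sample mean should land close to the requested rate.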
Finally, whichever generator you use, you need to know the rate. Bearing in mind that poisson is the French word for fish, one of the worst jokes in prob & stats is the statement "the Poisson scales." This means that the hourly rate can be converted to a daily rate by simply multiplying by 24, the number of hours in a day. For example, if you have an average of 3 per hour, you will have an average of 72 per day.
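As a rough sketch of the numpy route applied to the question's setup, the following builds one dictionary per day and then checks the scaling claim numerically. The hourly rate of 3, the 24-hour layout, the `week` container, and the seed are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility

hourly_rate = 3.0  # assumed average number of clients per hour
day_names = ['mon', 'tue', 'wed', 'thurs', 'fri', 'sat', 'sun']
hours = range(24)  # assumed: hour h covers the interval from h to h+1

# One dict per day, mapping hour -> Poisson-distributed number of clients.
week = {
    day: dict(zip(hours, rng.poisson(lam=hourly_rate, size=24).tolist()))
    for day in day_names
}

# Scaling check: daily totals built from hourly draws average about 24 * rate,
# the same mean as drawing a daily count directly with lam = 24 * hourly_rate.
daily_totals = rng.poisson(lam=hourly_rate, size=(10000, 24)).sum(axis=1)
daily_direct = rng.poisson(lam=24 * hourly_rate, size=10000)
print(daily_totals.mean(), daily_direct.mean())  # both should be close to 72
```

Keeping the days in a single `week` dictionary, rather than in seven separate variables, makes it easier to loop over them later, for example when fitting a regression per day.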
Upvotes: 1
Reputation: 21643
The inter-arrival times for a Poisson process (with the usual simplifying assumptions) are exponentially distributed. In this kind of modelling work then, it's the inter-arrival times that are often used rather than the parent process.
Here's how you can get a count for each hour of a Poisson process using a well-known Python library (expon here is scipy.stats.expon). Notice that scale is the inverse of the Poisson parameter.
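>>> from scipy.stats import expon
>>> def hourly_arrivals(scale=1):
...     count = 0
...     elapsed = expon.rvs(scale=scale)  # time of the first arrival
...     while elapsed < 1:  # the arrival falls within the hour
...         count += 1
...         elapsed += expon.rvs(scale=scale)  # add the next inter-arrival time
...     return count
...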
>>> hourly_arrivals()
0
>>> hourly_arrivals()
8
>>> hourly_arrivals()
0
>>> hourly_arrivals()
1
>>> hourly_arrivals()
4
>>> hourly_arrivals()
0
>>> hourly_arrivals()
2
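The same idea extends to a whole day: keep adding inter-arrival times and record which hour each arrival lands in. The sketch below uses only the standard library; `hourly_counts` and its parameters are hypothetical names. Note that `random.expovariate` takes the rate directly, whereas the `scale` argument above is its inverse.

```python
import random


def hourly_counts(rate_per_hour, hours=24, seed=None):
    """Simulate arrival times over `hours` hours and count arrivals per hour."""
    rng = random.Random(seed)
    counts = [0] * hours
    t = rng.expovariate(rate_per_hour)  # time of the first arrival, in hours
    while t < hours:
        counts[int(t)] += 1  # int(t) is the index of the hour containing t
        t += rng.expovariate(rate_per_hour)  # add the next inter-arrival time
    return counts


counts = hourly_counts(rate_per_hour=3.0, hours=24, seed=1)
print(counts)
```

Each entry of `counts` is then a Poisson-distributed count for one hour, and the list can be zipped with hour labels to build a per-day dictionary.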
You have also asked about 'any other arrival which truly represents such scenario'. This is an empirical problem. I would say, gather as many steady-state inter-arrival times as you can for the system you are studying and try to fit a cumulative distribution function to them. If you would like to discuss that, please post another question.
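As one possible illustration of that fitting step, the sketch below fits an exponential distribution to a sample of inter-arrival times and measures how far the empirical CDF is from the fitted one. The data here are synthetic stand-ins generated with an assumed rate; with real measurements, `samples` would come from the system's logs, and the exponential is only one candidate distribution.

```python
import math
import random

# Stand-in data: in practice these would be measured inter-arrival times (hours).
rng = random.Random(7)
true_rate = 3.0  # hypothetical, used only to generate the stand-in sample
samples = sorted(rng.expovariate(true_rate) for _ in range(5000))

# Maximum-likelihood estimate of the exponential rate is 1 / sample mean.
fitted_rate = len(samples) / sum(samples)


def exponential_cdf(t, rate):
    """CDF of an exponential distribution with the given rate."""
    return 1.0 - math.exp(-rate * t)


# Largest gap between the empirical CDF and the fitted CDF
# (the Kolmogorov-Smirnov distance); small values suggest a reasonable fit.
n = len(samples)
max_gap = max(
    max(abs((i + 1) / n - exponential_cdf(t, fitted_rate)),
        abs(i / n - exponential_cdf(t, fitted_rate)))
    for i, t in enumerate(samples)
)
print(fitted_rate, max_gap)
```

If the gap is large for real data, that is a sign the exponential (and hence the Poisson count model) may not describe the system well and another distribution should be tried.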
Upvotes: 1