prodev_paris

Reputation: 515

Unpredictable Poisson noise

I'm comparing two sets of values to which I apply Poisson noise. Below is my code and the corresponding result:

import numpy as np
import pylab

size = 14000

# 1) Creating first array
np.random.seed(1)
sample = np.zeros((size),dtype="int")+1000
# Applying poisson noise
random_sample1 = np.random.poisson(sample)

# 2) Creating the second array (with some changed values)
# Set every 220th value to 2000
sample[::220] = 2000
# Reset the seed to the SAME value as for the first array
# so that poisson() draws from the same random stream.
np.random.seed(1)
# Applying poisson noise
random_sample2 = np.random.poisson(sample)

# Display diff result
pylab.plot(random_sample2-random_sample1)
pylab.show()

[Plot: random_sample2 - random_sample1 against index]

My question is: why do I get these strange values around [10335-12542], where I would expect a perfect zero?

I searched the poisson() documentation for an explanation, without success.

I have (only) tested and reproduced the problem with Python 2.7.6 and 2.7.9 (it may appear with other versions). NumPy versions tested: 1.6.2 and 1.9.2.

More details if I print related values:

random_sample1[10335:10345]
[ 977 1053  968 1032 1051  953 1036 1035  967  954]
#  OK  OK    OK   OK   OK  OK!  ???  ???  ???  ???
random_sample2[10335:10345]
[ 977 1053  968 1032 1051 2051 1035  967  954 1034]
#  OK  OK    OK   OK   OK  OK!  ???  ???  ???  ???

We can clearly see that the values up to index 10339 are exactly the same. At index 10340 the value changes because sample[10340] == 2000, which is what we want. But the following values are not what we expect: they appear to be shifted by one index!

Upvotes: 2

Views: 516

Answers (1)

Rob

Reputation: 3513

This follows from the algorithm NumPy uses to draw samples from the Poisson distribution. See the source code here.

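Each sample is drawn in a conditional loop: it fetches new random values from the generator and returns only once a candidate passes an acceptance test that depends on lambda. Different lambdas can therefore need a different number of tries. When one sample consumes more random values than it did in the other run, every following sample draws from a shifted position in the random stream, which leads to the different results you see. Later on, another sample needs a different number of tries in the two runs, and the random values sync up again.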
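To make the mechanism concrete, here is a minimal sketch using only the standard library. It is not NumPy's implementation: it uses Knuth's multiplication method, a simple textbook sampler chosen here because it also consumes a variable number of uniforms per sample. The names knuth_poisson and draw_all and the lambdas 4.0 and 8.0 are made up for this illustration.

```python
import math
import random


def knuth_poisson(lam, rng):
    """Return (sample, uniforms_used) for one Poisson(lam) draw.

    Knuth's multiplication method: multiply uniforms together until the
    product drops to exp(-lam) or below. A result of k costs k + 1 uniforms.
    """
    threshold = math.exp(-lam)
    k = 0
    used = 1
    product = rng.random()
    while product > threshold:
        product *= rng.random()
        k += 1
        used += 1
    return k, used


def draw_all(lams, seed):
    """Draw one sample per lambda and record where each draw starts in the stream."""
    rng = random.Random(seed)
    samples, positions = [], []
    position = 0
    for lam in lams:
        positions.append(position)
        k, used = knuth_poisson(lam, rng)
        samples.append(k)
        position += used
    return samples, positions


lams_a = [4.0] * 10
lams_b = list(lams_a)
lams_b[3] = 8.0  # change a single lambda, as the question does at every 220th index

a, pos_a = draw_all(lams_a, seed=1)
b, pos_b = draw_all(lams_b, seed=1)

# Columns: index, sample in run a, sample in run b, stream position in a, in b
for i in range(len(lams_a)):
    print(i, a[i], b[i], pos_a[i], pos_b[i])
```

Everything before index 3 is identical in both runs. From index 4 onward the two runs start each sample at different stream positions whenever the changed draw consumed a different number of uniforms, so the later samples no longer have to match even though their lambdas are the same.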

In your specific example, drawing sample #10340 with lambda = 2000 takes one extra try compared with lambda = 1000. After that, all values are offset by one until the streams sync up again.
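If you want to check this on your own data, something along these lines should work. It is only a sketch: it assumes np.random.RandomState(1) reproduces the stream you get after np.random.seed(1), and the variable names (base, changed, untouched, mismatch) are mine. It looks for positions where lambda was not changed but the samples still differ, then measures how many of them match a one-position shift.

```python
import numpy as np

size = 14000
base = np.full(size, 1000)   # lambda = 1000 everywhere
changed = base.copy()
changed[::220] = 2000        # same change as in the question

# Two generators with the same seed, mirroring the two np.random.seed(1) calls
a = np.random.RandomState(1).poisson(base)
b = np.random.RandomState(1).poisson(changed)

# Positions where lambda is unchanged but the samples differ anyway
untouched = changed == base
mismatch = np.flatnonzero(untouched & (a != b))
print("untouched positions that differ:", mismatch.size)

if mismatch.size:
    first, last = int(mismatch[0]), int(mismatch[-1])
    print("drift spans indices", first, "to", last)

    # Inside the drifting range, compare b[i] with a[i + 1] at untouched positions
    idx = np.arange(first, min(last + 1, size - 1))
    idx = idx[untouched[idx]]
    if idx.size:
        shifted = np.mean(b[idx] == a[idx + 1])
        print("fraction matching a one-position shift:", shifted)
```

A fraction close to 1 would support the offset-by-one explanation for the whole drifting range. A lower value would suggest that more than one extra try happened along the way.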

Upvotes: 2
