Ian Hincks

Reputation: 4118

Unexpected scipy.stats.uniform behaviour

Consider the code:

import numpy as np
import scipy.stats as ss
x = ss.uniform.rvs(np.zeros(5),np.array([1,2,3,4,5]))

I find the documentation for scipy.stats a bit sparse. From what I can tell, the above code is supposed to pick one random number from each of the intervals [0,1], [0,2], [0,3], [0,4], and [0,5]. Here is the documentation for rvs and uniform.

Instead, it picks a random number p in [0,1] and returns [p,2p,3p,4p,5p]:

print x, np.diff(x)
[ 0.79352054  1.58704108  2.38056162  3.17408215  3.96760269] 
[ 0.79352054  0.79352054  0.79352054  0.79352054]

Is this a seed-related bug? Or is this behaviour expected?

Edit: I am aware that it is easy to get around this, so there is no need to tell me how: x=ss.uniform.rvs(size=5)*np.arange(1,6). This bug or feature has cost me a couple of days of confusion and debugging in my larger program.
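For anyone who wants to test their own output for the symptom described above, here is a minimal sketch. The helper name shares_single_draw is hypothetical (it is not part of scipy); it only checks whether every element divided by its scale gives the same number, which is what [p,2p,3p,4p,5p] looks like. The sample values are the ones printed in the question.

```python
import numpy as np

def shares_single_draw(x, scale):
    """Return True if every x[i] / scale[i] is (nearly) the same number.

    Hypothetical helper for illustration; not a scipy function.
    """
    ratios = np.asarray(x, dtype=float) / np.asarray(scale, dtype=float)
    return bool(np.allclose(ratios, ratios[0]))

scale = np.array([1, 2, 3, 4, 5])

# Output copied from the question: one draw p scaled by 1..5.
buggy = np.array([0.79352054, 1.58704108, 2.38056162, 3.17408215, 3.96760269])
print(shares_single_draw(buggy, scale))
```

Independent draws will almost never pass this check, so a True result is a strong hint that only one underlying random number was used.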

Upvotes: 4

Views: 562

Answers (2)

hpaulj

Reputation: 231335

It looks to me like the problem is in uniform.rvs, which tries to handle both the *args and size. If I first create a (frozen) uniform object and then call its rvs method, it appears to behave.

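To produce 3 uniform distributions over the ranges [0,1), [5,7), [10,13), I can define a uniform object with range starts (loc) of 0, 5, 10 and range widths (scale) of 1, 2, 3. This assumes numpy has been imported as np and that stats comes from "from scipy import stats":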

In [543]: u=stats.uniform(np.array([0,5,10]),np.array([1,2,3]))

Now I can generate a sample of any shape whose last dimension is compatible with the size-3 parameter arrays:

In [544]: x = u.rvs((5,3))
In [545]: x
Out[545]: 
array([[  0.28689704,   6.60720428,  12.78343224],
       [  0.3058824 ,   6.22486472,  11.5212319 ],
       [  0.32274603,   6.72905376,  10.90760859],
       [  0.98299464,   5.39877562,  12.00342556],
       [  0.76728063,   5.26172895,  10.38177301]])
In [546]: x.mean(axis=0)
Out[546]: array([  0.53316015,   6.04432547,  11.51949426])

This may be just another way around the missing size parameter in the stats.uniform.rvs call.
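The session above can be condensed into a self-contained script. This is only a sketch of the same frozen-distribution approach; the variable names loc, scale and in_range are my own, and the range check is there as a sanity test rather than proof that the columns are independent.

```python
import numpy as np
from scipy import stats

# Range starts and widths for the three uniform distributions.
loc = np.array([0, 5, 10])
scale = np.array([1, 2, 3])

# Freeze the distribution first, then ask for samples with an explicit shape.
u = stats.uniform(loc, scale)
x = u.rvs(size=(5, 3))  # 5 rows, one column per distribution

# Each column should stay inside [loc, loc + scale].
in_range = bool(np.all((x >= loc) & (x <= loc + scale)))
print(x.shape, in_range)
```

The last dimension of the requested shape has to match (or broadcast against) the length-3 parameter arrays.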

Upvotes: 0

Warren Weckesser

Reputation: 114781

It's a bug: https://github.com/scipy/scipy/issues/2069

A different work-around for your example is to give the size argument explicitly along with the arguments that you are already using.

For example, here's the buggy case:

In [1]: import scipy.stats as ss

In [2]: x = ss.uniform.rvs(np.zeros(5), np.array([1,2,3,4,5]))

In [3]: x
Out[3]: array([ 0.23848443,  0.47696885,  0.71545328,  0.9539377 ,  1.19242213])

In [4]: x/x[0]
Out[4]: array([ 1.,  2.,  3.,  4.,  5.])

The work-around is to include the argument size=5:

In [18]: x = ss.uniform.rvs(np.zeros(5), np.array([1,2,3,4,5]), size=5)

In [19]: x
Out[19]: array([ 0.67638863,  1.2253443 ,  0.0812362 ,  3.87469514,  3.88145975])

In [20]: x/x[0]
Out[20]: array([ 1.        ,  1.81159802,  0.12010285,  5.72850428,  5.73850534])
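As a script, the same work-around might look like the sketch below. The names widths and unit_draws are mine, and the final check simply confirms that the values are not all multiples of one shared draw; the exact numbers will differ on every run.

```python
import numpy as np
import scipy.stats as ss

widths = np.array([1, 2, 3, 4, 5])

# Pass size explicitly so that one value is drawn per (loc, scale) pair.
x = ss.uniform.rvs(np.zeros(5), widths, size=5)

# Dividing by the widths recovers the underlying draws on [0, 1];
# with the work-around they should not all be the same number.
unit_draws = x / widths
independent = not np.allclose(unit_draws, unit_draws[0])
print(x, independent)
```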

Upvotes: 4
