Reputation: 4118
Consider the code:
import numpy as np
import scipy.stats as ss
x = ss.uniform.rvs(np.zeros(5), np.array([1,2,3,4,5]))
I find the documentation for scipy.stats a bit sparse. From what I can tell, the above code is supposed to pick one random number from each of the intervals [0,1], [0,2], [0,3], [0,4], and [0,5]. Here is the documentation for rvs and uniform.
Instead, it picks a random number p in [0,1] and returns [p,2p,3p,4p,5p]:
print(x)
print(np.diff(x))
[ 0.79352054 1.58704108 2.38056162 3.17408215 3.96760269]
[ 0.79352054 0.79352054 0.79352054 0.79352054]
Is this a seed-related bug, or is this behaviour expected?
Edit: I am aware that it is easy to get around this, so there is no need to tell me how: x=ss.uniform.rvs(size=5)*np.arange(1,6). This bug or feature has cost me a couple of days of confusion and debugging in my larger program.
Upvotes: 4
Views: 562
Reputation: 231335
Looks to me like the problem is in uniform.rvs, which tries to handle both the *args and size. If I first create a uniform object and then call rvs, it appears to behave.
To produce 3 uniform distributions over the ranges [0,1), [5,7), [10,13), I can define a uniform object with range starts (loc) of 0, 5, 10 and range widths (scale) of 1, 2, 3:
In [543]: u=stats.uniform(np.array([0,5,10]),np.array([1,2,3]))
Now I can draw samples of any shape whose last dimension is compatible with 3:
In [544]: x = u.rvs((5,3))
In [545]: x
Out[545]:
array([[ 0.28689704, 6.60720428, 12.78343224],
[ 0.3058824 , 6.22486472, 11.5212319 ],
[ 0.32274603, 6.72905376, 10.90760859],
[ 0.98299464, 5.39877562, 12.00342556],
[ 0.76728063, 5.26172895, 10.38177301]])
In [546]: x.mean(axis=0)
Out[546]: array([ 0.53316015, 6.04432547, 11.51949426])
This may be just another way around the missing size parameter in the stats.uniform.rvs call.
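As a rough sanity check of the frozen-object approach, something like the sketch below could be used. The sample count of 1000 and random_state=0 are my own choices for reproducibility, not part of the session above, and the random_state keyword assumes a reasonably recent SciPy.

```python
import numpy as np
from scipy import stats

loc = np.array([0, 5, 10])    # range starts
scale = np.array([1, 2, 3])   # range widths

# Frozen distribution: column j covers [loc[j], loc[j] + scale[j]]
u = stats.uniform(loc, scale)

# 1000 rows of 3 draws; the seed is only there to make the sketch repeatable
x = u.rvs(size=(1000, 3), random_state=0)

# Every column should stay inside its own range ...
in_range = bool(np.all((x >= loc) & (x <= loc + scale)))
# ... and each column mean should sit near the midpoint loc + scale / 2
col_means = x.mean(axis=0)

print("all samples in range:", in_range)
print("column means:", col_means)
```

If the columns were all scaled copies of one draw, the range check would still pass, so it is the per-column means (and a look at a few rows) that show the columns behave as separate distributions.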
Upvotes: 0
Reputation: 114781
It's a bug: https://github.com/scipy/scipy/issues/2069
A different work-around for your example is to give the size argument explicitly along with the arguments that you are already using.
For example, here's the buggy case:
In [1]: import scipy.stats as ss
In [2]: x = ss.uniform.rvs(np.zeros(5), np.array([1,2,3,4,5]))
In [3]: x
Out[3]: array([ 0.23848443, 0.47696885, 0.71545328, 0.9539377 , 1.19242213])
In [4]: x/x[0]
Out[4]: array([ 1., 2., 3., 4., 5.])
The work-around is to include the argument size=5:
In [18]: x = ss.uniform.rvs(np.zeros(5), np.array([1,2,3,4,5]), size=5)
In [19]: x
Out[19]: array([ 0.67638863, 1.2253443 , 0.0812362 , 3.87469514, 3.88145975])
In [20]: x/x[0]
Out[20]: array([ 1. , 1.81159802, 0.12010285, 5.72850428, 5.73850534])
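If you want to guard against this in a larger program, a small check along these lines might help. It is only a sketch: the names unit and independent are made up for illustration, and random_state=0 is there purely for reproducibility (it assumes a SciPy version whose rvs accepts that keyword).

```python
import numpy as np
import scipy.stats as ss

loc = np.zeros(5)
scale = np.array([1, 2, 3, 4, 5])

# Explicit size asks for one draw per (loc, scale) pair
x = ss.uniform.rvs(loc, scale, size=5, random_state=0)

# Map each draw back to the unit interval; in the buggy case
# every entry would be the same number p
unit = (x - loc) / scale
independent = not np.allclose(unit, unit[0])

print(x)
print("independent draws:", independent)
```

The same check works on the output of the call without size, which is the quickest way to see whether the SciPy version you are running is affected.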
Upvotes: 4