Suppose I have the following two arrays of means and standard deviations:
import numpy as np
mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])
Then:
a = np.random.normal(mu, sigma)
In [1]: a
Out[1]: array([1715.6903716 , 3028.54168667, 4731.34048645, 933.18903575])
However, if I ask for 100 draws for each element of mu and sigma:
a = np.random.normal(mu, sigma, 100)
Traceback (most recent call last):
File "<ipython-input-417-4aadd7d15875>", line 1, in <module>
a = np.random.normal(mu, sigma, 100)
File "mtrand.pyx", line 1652, in mtrand.RandomState.normal
File "mtrand.pyx", line 265, in mtrand.cont2_array
ValueError: shape mismatch: objects cannot be broadcast to a single shape
I have also tried passing a tuple as size:
s = (100, 100, 100, 100)
a = np.random.normal(mu, sigma, s)
What am I missing?
Upvotes: 10
Views: 7493
Reputation: 9
This is an old question but I had the same issue recently and the documentation is still not clear at present, so my answer may be useful to other people.
The thing is that if you want to draw n_sample samples from (uncorrelated) normal distributions with n_param different parameters, the size argument of the function needs to be the tuple (n_sample, n_param). Back to your example:
mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])
n_sample = 10
n_param = len(mu)
np.random.normal(mu, sigma, (n_sample, n_param))
which returns
array([[2048.27840802, 2997.96810385, 4388.76381537, 834.58578664],
[2284.62302217, 3057.37011582, 5141.42601472, 757.21437687],
[1933.16814182, 3060.13736788, 5431.56812414, 949.80295487],
[2444.69699622, 3049.32584965, 4850.82175943, 772.26041345],
[2129.87928253, 2976.20614441, 5140.33783836, 1017.96741881],
[1906.47137372, 2829.44037933, 4894.20964032, 1245.29240452],
[2031.94886175, 2693.19106648, 5385.33674047, 849.72485587],
[2034.22639971, 3017.86916011, 5050.08920701, 1198.48286148],
[2278.8297283 , 3036.31308636, 5043.93694099, 988.87438521],
[1760.04486593, 2875.0750094 , 4615.1775128 , 946.76458665]])
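The shape rule can be checked directly. The sketch below is only an illustration: it assumes NumPy's newer np.random.default_rng generator (not used in the answer above) with an arbitrarily chosen seed. It shows that size has to broadcast against mu and sigma, so the parameter axis goes last, and that a (n_param, n_sample) layout is possible if the parameters are given a trailing axis of length 1.

```python
import numpy as np

mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])

# Assumption: the Generator API with an arbitrary seed, for reproducibility only
rng = np.random.default_rng(0)

n_sample = 100
n_param = len(mu)

# size must broadcast with mu and sigma, so n_param is the trailing axis
draws = rng.normal(mu, sigma, size=(n_sample, n_param))
print(draws.shape)  # (100, 4)

# For a (n_param, n_sample) layout, reshape the parameters to (n_param, 1)
draws_t = rng.normal(mu[:, None], sigma[:, None], size=(n_param, n_sample))
print(draws_t.shape)  # (4, 100)
```

Each column of draws (or each row of draws_t) holds the samples for one (mu, sigma) pair.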
Upvotes: 0
Reputation: 504
If you want to make only one call, the normal distribution is easy enough to shift and rescale after the fact. (I'm making up a 10000-long vector of mu and sigma from your example here):
mu = np.random.choice([2000., 3000., 5000., 1000.], 10000)
sigma = np.random.choice([250., 152., 397., 180.], 10000)
a = np.random.normal(size=(10000, 100)) * sigma[:,None] + mu[:,None]
This works fine. Whether speed is an issue is up to you: on my system the following per-pair loop is only about 50% slower:
a = np.array([np.random.normal(m, s, 100) for m,s in zip(mu, sigma)])
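The shift-and-rescale trick relies on the fact that if Z is standard normal, then mu + sigma * Z is normal with mean mu and standard deviation sigma. A minimal sketch of a sanity check follows, assuming the four original parameter pairs, a large hypothetical draw count (n_draws), and np.random.default_rng with an arbitrary seed; none of these choices come from the answer above.

```python
import numpy as np

# Assumption: Generator API with an arbitrary seed, for reproducibility only
rng = np.random.default_rng(1)

mu = np.array([2000.0, 3000.0, 5000.0, 1000.0])
sigma = np.array([250.0, 152.0, 397.0, 180.0])
n_draws = 100_000  # hypothetical sample count, large enough to check moments

# One call for standard normal draws, then scale and shift each row
z = rng.standard_normal(size=(len(mu), n_draws))
a = z * sigma[:, None] + mu[:, None]

print(a.shape)         # (4, 100000)
print(a.mean(axis=1))  # should be close to mu
print(a.std(axis=1))   # should be close to sigma
```

The [:, None] indexing turns each parameter vector into a column, so row i of a is scaled by sigma[i] and shifted by mu[i].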
Upvotes: 2
Reputation: 402483
A scalar size such as 100 cannot be broadcast against length-4 vectors for the mean and std. One workaround is to iterate over each pair and then concatenate:
np.concatenate(
[np.random.normal(m, s, 100) for m, s in zip(mu, sigma)]
)
This gives you a (400,) array. If you want a (4, 100) array instead, call np.array instead of np.concatenate.
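To make the difference between the two layouts concrete, here is a small self-contained sketch. The names per_param, flat and stacked are introduced only for illustration.

```python
import numpy as np

mu = np.array([2000, 3000, 5000, 1000])
sigma = np.array([250, 152, 397, 180])

# One array of 100 draws per (mean, std) pair
per_param = [np.random.normal(m, s, 100) for m, s in zip(mu, sigma)]

flat = np.concatenate(per_param)  # draws laid end to end
stacked = np.array(per_param)     # one row per (mean, std) pair

print(flat.shape)     # (400,)
print(stacked.shape)  # (4, 100)

# The first 100 entries of flat are the draws for mu[0], sigma[0]
print(np.array_equal(flat[:100], stacked[0]))  # True
```

The stacked form is usually easier to work with, since stacked[i] holds exactly the draws for mu[i] and sigma[i].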
Upvotes: 3