Reputation: 6662
I want to evaluate several normal CDFs, defined by a 4x3 grid of points, at each of 5 points.
import numpy as np
import scipy.stats
a = np.array([-1, 0, 1])
b = np.array([1, 2, 3, 4])
x = np.array([-.5, 0, .5, 1, 2])
grid_a, grid_b = np.meshgrid(a,b)
scipy.stats.norm(loc=grid_a, scale=grid_b).cdf(x)
Raises this exception:
ValueErrorTraceback (most recent call last)
<ipython-input-46-82423c7451d2> in <module>()
3 x = np.array([-.5, 0, .5, 1, 2])
4 grid_a, grid_b = np.meshgrid(a,b)
----> 5 scipy.stats.norm(loc=grid_a, scale=grid_b).cdf(x)
~/.envs/practice/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py in cdf(self, x)
454
455 def cdf(self, x):
--> 456 return self.dist.cdf(x, *self.args, **self.kwds)
457
458 def logcdf(self, x):
~/.envs/practice/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py in cdf(self, x, *args, **kwds)
1733 args = tuple(map(asarray, args))
1734 dtyp = np.find_common_type([x.dtype, np.float64], [])
-> 1735 x = np.asarray((x - loc)/scale, dtype=dtyp)
1736 cond0 = self._argcheck(*args) & (scale > 0)
1737 cond1 = self._open_support_mask(x) & (scale > 0)
ValueError: operands could not be broadcast together with shapes (5,) (4,3)
Upvotes: 0
Views: 73
Reputation: 114946
You have to reshape a
, b
and x
to be compatible for broadcasting. You can do this, for example, by adding one trivial dimension to a
and two trivial dimensions to b
. That is, use a[:, None]
(which has shape (3, 1)) and b[:, None, None]
(which has shape (4, 1, 1)). (Instead of None
, you might prefer the more explicit np.newaxis
, but its value is just None
.) Then with x
having shape (5,) and the reshaped a
and b
having shapes (3, 1) and (4, 1, 1), respectively, the shape of the computed result with broadcasting is (4, 3, 5):
In [45]: from scipy.stats import norm
In [46]: a = np.array([-1, 0, 1])
In [47]: b = np.array([1, 2, 3, 4])
In [48]: x = np.array([-.5, 0, .5, 1, 2])
In [49]: c = norm.cdf(x, loc=a[:, None], scale=b[:, None, None])
In [50]: c.shape
Out[50]: (4, 3, 5)
In [51]: c
Out[51]:
array([[[0.69146246, 0.84134475, 0.9331928 , 0.97724987, 0.9986501 ],
[0.30853754, 0.5 , 0.69146246, 0.84134475, 0.97724987],
[0.0668072 , 0.15865525, 0.30853754, 0.5 , 0.84134475]],
[[0.59870633, 0.69146246, 0.77337265, 0.84134475, 0.9331928 ],
[0.40129367, 0.5 , 0.59870633, 0.69146246, 0.84134475],
[0.22662735, 0.30853754, 0.40129367, 0.5 , 0.69146246]],
[[0.56618383, 0.63055866, 0.69146246, 0.74750746, 0.84134475],
[0.43381617, 0.5 , 0.56618383, 0.63055866, 0.74750746],
[0.30853754, 0.36944134, 0.43381617, 0.5 , 0.63055866]],
[[0.54973822, 0.59870633, 0.64616977, 0.69146246, 0.77337265],
[0.45026178, 0.5 , 0.54973822, 0.59870633, 0.69146246],
[0.35383023, 0.40129367, 0.45026178, 0.5 , 0.59870633]]])
It also works to use the cdf()
method of the "frozen" distribution norm(loc=a[:, None], scale=b[:, None, None])
like you did in the question:
In [52]: c = norm(loc=a[:, None], scale=b[:, None, None]).cdf(x)
In [53]: c.shape
Out[53]: (4, 3, 5)
Upvotes: 2