Hatshepsut
Hatshepsut

Reputation: 6662

Evaluate normal cdfs at each of several points

I want to evaluate several normal CDFs, defined by a 4x3 grid of points, at each of 5 points.

import numpy as np
import scipy.stats


a = np.array([-1, 0, 1])
b = np.array([1, 2, 3, 4])
x = np.array([-.5, 0, .5, 1, 2])
grid_a, grid_b = np.meshgrid(a,b)
scipy.stats.norm(loc=grid_a, scale=grid_b).cdf(x)

Raises this exception:

ValueErrorTraceback (most recent call last)                                                                                     
 <ipython-input-46-82423c7451d2> in <module>()                                                                                   
       3 x = np.array([-.5, 0, .5, 1, 2])                                                                                        
       4 grid_a, grid_b = np.meshgrid(a,b)                                                                                       
 ----> 5 scipy.stats.norm(loc=grid_a, scale=grid_b).cdf(x)                                                                       

 ~/.envs/practice/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py in cdf(self, x)                               
     454                                                                                                                         
     455     def cdf(self, x):                                                                                                   
 --> 456         return self.dist.cdf(x, *self.args, **self.kwds)                                                                
     457                                                                                                                         
     458     def logcdf(self, x):                                                                                                

 ~/.envs/practice/lib/python3.6/site-packages/scipy/stats/_distn_infrastructure.py in cdf(self, x, *args, **kwds)                
    1733         args = tuple(map(asarray, args))                                                                                
    1734         dtyp = np.find_common_type([x.dtype, np.float64], [])                                                           
 -> 1735         x = np.asarray((x - loc)/scale, dtype=dtyp)                                                                     
    1736         cond0 = self._argcheck(*args) & (scale > 0)                                                                     
    1737         cond1 = self._open_support_mask(x) & (scale > 0)                                                                

 ValueError: operands could not be broadcast together with shapes (5,) (4,3)                                                     

Upvotes: 0

Views: 73

Answers (1)

Warren Weckesser
Warren Weckesser

Reputation: 114946

You have to reshape a, b and x to be compatible for broadcasting. You can do this, for example, by adding one trivial dimension to a and two trivial dimensions to b. That is, use a[:, None] (which has shape (3, 1)) and b[:, None, None] (which has shape (4, 1, 1)). (Instead of None, you might prefer the more explicit np.newaxis, but its value is just None.) Then with x having shape (5,) and the reshaped a and b having shapes (3, 1) and (4, 1, 1), respectively, the shape of the computed result with broadcasting is (4, 3, 5):

In [45]: from scipy.stats import norm

In [46]: a = np.array([-1, 0, 1])

In [47]: b = np.array([1, 2, 3, 4])

In [48]: x = np.array([-.5, 0, .5, 1, 2])

In [49]: c = norm.cdf(x, loc=a[:, None], scale=b[:, None, None])

In [50]: c.shape
Out[50]: (4, 3, 5)

In [51]: c
Out[51]: 
array([[[0.69146246, 0.84134475, 0.9331928 , 0.97724987, 0.9986501 ],
        [0.30853754, 0.5       , 0.69146246, 0.84134475, 0.97724987],
        [0.0668072 , 0.15865525, 0.30853754, 0.5       , 0.84134475]],

       [[0.59870633, 0.69146246, 0.77337265, 0.84134475, 0.9331928 ],
        [0.40129367, 0.5       , 0.59870633, 0.69146246, 0.84134475],
        [0.22662735, 0.30853754, 0.40129367, 0.5       , 0.69146246]],

       [[0.56618383, 0.63055866, 0.69146246, 0.74750746, 0.84134475],
        [0.43381617, 0.5       , 0.56618383, 0.63055866, 0.74750746],
        [0.30853754, 0.36944134, 0.43381617, 0.5       , 0.63055866]],

       [[0.54973822, 0.59870633, 0.64616977, 0.69146246, 0.77337265],
        [0.45026178, 0.5       , 0.54973822, 0.59870633, 0.69146246],
        [0.35383023, 0.40129367, 0.45026178, 0.5       , 0.59870633]]])

It also works to use the cdf() method of the "frozen" distribution norm(loc=a[:, None], scale=b[:, None, None]) like you did in the question:

In [52]: c = norm(loc=a[:, None], scale=b[:, None, None]).cdf(x)

In [53]: c.shape
Out[53]: (4, 3, 5)

Upvotes: 2

Related Questions