Reputation: 648
How to evaluate log(1 - normal_cdf(x))
in a numerically stable way? Here, normal_cdf
is the cumulative distribution function of the standard Normal distribution.
For example, in Python:
import scipy
from scipy.stats import norm
np.log(1 - norm.cdf(10))
gives -inf
with RuntimeWarning: divide by zero encountered in log
since norm.cdf(10)
is almost equal to 1
. Is there a function like logsumexp
that can avoid numerical under-flow?
Upvotes: 1
Views: 525
Reputation: 114841
@HongOoi's suggestion to take advantage of the symmetry is great. But for an arbitrary distribution in scipy.stats
(including norm
), you can use the method logsf
for exactly this computation. sf
stands for survival function, which is the name of the function 1 - cdf(x)
.
For example,
In [25]: import numpy as np
In [26]: from scipy.stats import norm, gamma
Here's an example of norm.logsf
:
In [27]: norm.logsf(3, loc=1, scale=1.5)
Out[27]: -2.3945773661586434
In [28]: np.log(1 - norm.cdf(3, loc=1, scale=1.5))
Out[28]: -2.3945773661586434
And here's an example of gamma.logsf
:
In [29]: gamma.logsf(1.2345, a=2, scale=1.8)
Out[29]: -0.16357333194167956
In [30]: np.log(1 - gamma.cdf(1.2345, a=2, scale=1.8))
Out[30]: -0.16357333194167956
This shows why one would want to use logsf(x)
instead of log(1 - cdf(x))
:
In [35]: norm.logsf(50, loc=1, scale=1.5)
Out[35]: -537.96178420294677
In [36]: np.log(1 - norm.cdf(50, loc=1, scale=1.5))
/Users/warren/miniconda3scipy/bin/ipython:1: RuntimeWarning: divide by zero encountered in log
#!/Users/warren/miniconda3scipy/bin/python
Out[36]: -inf
Upvotes: 2
Reputation: 57686
Since the normal distribution is symmetric about 0, we have
1 - F(x) = P(X > x)
= P(X < -x)
= F(-x)
Hence
np.log(1 - norm.cdf(10)) = np.log(norm.cdf(-10))
= norm.logcdf(-10)
Upvotes: 4