Reputation: 18521
I've been using scipy.special.erfcinv
to calculate Z scores from pvalues. However, when the pvalues become very small, erfcinv
gets unexpectedly large. Any ideas?
Example:
In [1]: import numpy as np
In [2]: from scipy.special import erfcinv
In [3]: erfcinv(2e-16) * np.sqrt(2)
Out[3]: 8.2095361516013874
In [4]: erfcinv(1e-16) * np.sqrt(2)
Out[4]: 1.7976931348623155e+308
I'm running python 2.6.6 with scipy 0.10.1.
Upvotes: 1
Views: 403
Reputation: 22897
If you want to calculate quantiles or p-values, the you could use the stats.distributions which find in most cases the most appropriate special function, in this case I always use stats.norm.isf for the upper tail pvalue (also because I don't want to remember what erf or erfcinv is).
>>> for y in 10.**(-np.arange(5, 30, 2)):
print y, special.erfcinv(y) * np.sqrt(2), stats.norm.isf(y/2.), -special.ndtri(y/2)
1e-05 4.41717341347 4.41717341347 4.41717341347
1e-07 5.32672388628 5.32672388638 5.32672388638
1e-09 6.10941019166 6.10941020487 6.10941020487
1e-11 6.80650247883 6.80650249074 6.80650249074
1e-13 7.4410077655 7.44090215064 7.44090215064
1e-15 8.01401594878 8.02685888253 8.02685888253
1e-17 inf 8.57394407672 8.57394407672
1e-19 inf 9.08895010083 9.08895010083
1e-21 inf 9.57690145543 9.57690145543
1e-23 inf 10.0416376122 10.0416376122
1e-25 inf 10.4861701796 10.4861701796
1e-27 inf 10.9129127092 10.9129127092
1e-29 inf 11.3238345582 11.3238345582
The floating point problems pointed out by zero323 will still show up in many other cases.
Upvotes: 3
Reputation: 330163
Short answer: the closer you get to the limits of floating point arithmetic precision the weirder things happens: (http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html, http://www.seas.ucla.edu/~vandenbe/103/lectures/flpt.pdf)
A little bit longer. First lets look at erfcinv function:
def erfcinv(y):
return ndtri((2-y)/2.0)/sqrt(2)
If we take y = 2e-16:
In [96]: (2 - 2e-16) / 2
Out[96]: 0.9999999999999999
When we take y = 1e-16:
In [97]: (2 - 1e-16) / 2
Out[97]: 1.0
Now we look at ndtri:
x=ndtri(y) returns the argument x for which the area udnder the
Gaussian probability density function (integrated from minus infinity
to x) is equal to y.
Now everything should be clear, am I right? As you can suspect:
In [99]: ndtri(1)
Out[99]: inf
Your results can be a little bit different - in my case:
In [101]: erfcinv(1e-16) * np.sqrt(2)
Out[101]: inf
Upvotes: 3