Bach

Reputation: 6227

ppf(0) of scipy's randint(0, 2) is -1.0

Perhaps I don't fully understand what .ppf() does, but according to Wikipedia, ppf(q) should return the infimum over all reals x for which q <= cdf(x). Since the cdf of any distribution is non-negative for every x, I'd expect ppf(0) to return -inf. However, that is not what happens:

import scipy.stats
scipy.stats.randint(0, 2).ppf(0)  # returns -1.0, not -inf

Any idea about the cause for this behaviour?

Upvotes: 2

Views: 188

Answers (1)

Robert Dodier

Reputation: 17585

You are correct -- randint.ppf is implemented in a not-so-careful way. Here is the code for cdf and ppf in scipy/stats/distributions.py (from scipy 0.9.0):

def _cdf(self, x, min, max):
    k = floor(x)
    return (k-min+1)*1.0/(max-min)
def _ppf(self, q, min, max):
    vals = ceil(q*(max-min)+min)-1
    vals1 = (vals-1).clip(min, max)
    temp = self._cdf(vals1, min, max)
    return where(temp >= q, vals1, vals)

As the first line of _ppf shows, q = 0 leads to vals = ceil(min) - 1 = min - 1, and ppf(0) reports this value one below the support (-1 for randint(0, 2)) rather than -inf. For q > 0 the results do agree with the infimum definition: ppf(0.01) = 0, because cdf(0) = 0.5 >= 0.01, and ppf(0.51) = 1, because cdf(0) = 0.5 < 0.51 <= cdf(1) = 1.

So ppf is not broken in general, but at q = 0 it was written without regard for the strict definition. The available documentation says that ppf is the "inverse of cdf", which cannot be taken literally when cdf is not one-to-one.
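
For comparison, here is a minimal sketch of the infimum definition quoted in the question, written in plain Python for the discrete uniform distribution on {low, ..., high - 1}. The function names uniform_int_cdf and quantile_by_infimum are made up for this illustration and are not part of scipy.

```python
import math


def uniform_int_cdf(x, low, high):
    """CDF of the discrete uniform distribution on {low, ..., high - 1}."""
    if x < low:
        return 0.0
    if x >= high - 1:
        return 1.0
    return (math.floor(x) - low + 1) / (high - low)


def quantile_by_infimum(q, low, high):
    """Return inf{x : q <= cdf(x)} for q in [0, 1] (hypothetical helper)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    if q == 0.0:
        # cdf(x) >= 0 holds for every real x, so the infimum is -inf.
        return -math.inf
    # The CDF only jumps at integers, so for q > 0 the infimum is the
    # smallest support point k with cdf(k) >= q.
    for k in range(low, high):
        if uniform_int_cdf(k, low, high) >= q:
            return float(k)
    return float(high - 1)  # unreachable when low < high


for q in (0.0, 0.01, 0.5, 0.51, 1.0):
    print(q, quantile_by_infimum(q, 0, 2))
```

Under this definition only q = 0 yields -inf. Any q in (0, 0.5] yields 0 and any q in (0.5, 1] yields 1, so the difference from randint(0, 2).ppf is confined to q = 0.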

Upvotes: 2
