develarist

Reputation: 1365

Is differential entropy calculated with integrate.quad in scipy.stats?

The entropy() method of scipy.stats continuous distributions calculates the differential entropy of a continuous random variable. By which estimation method, and by which formula exactly, does it calculate the differential entropy (e.g. for the normal distribution versus the beta distribution)?

Below is its GitHub source. Differential entropy is the negative of the integral of the p.d.f. multiplied by the log of the p.d.f., i.e. h(X) = -∫ p(x) log p(x) dx, but nowhere do I see that integrand or the log written explicitly. Could it be hidden in the call to integrate.quad?

def _entropy(self, *args):
    def integ(x):
        val = self._pdf(x, *args)
        return entr(val)

    # upper limit is often inf, so suppress warnings when integrating
    _a, _b = self._get_support(*args)
    with np.errstate(over='ignore'):
        h = integrate.quad(integ, _a, _b)[0]

    if not np.isnan(h):
        return h
    else:
        # try with different limits if integration problems
        low, upp = self.ppf([1e-10, 1. - 1e-10], *args)
        if np.isinf(_b):
            upper = upp
        else:
            upper = _b
        if np.isinf(_a):
            lower = low
        else:
            lower = _a
        return integrate.quad(integ, lower, upper)[0]

Source (lines 2501 - 2524): https://github.com/scipy/scipy/blob/master/scipy/stats/_distn_infrastructure.py
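
For reference, this is the calculation I have in mind, written out by hand for a standard normal as a rough sketch (the closed form 0.5 * log(2 * pi * e) is known for N(0, 1)):

import numpy as np
from scipy import integrate, stats

rv = stats.norm(loc=0, scale=1)

def neg_p_log_p(x):
    p = rv.pdf(x)
    # integrand -p(x) * log(p(x)); define it as 0 where the density is 0
    return -p * np.log(p) if p > 0 else 0.0

h_numeric, _ = integrate.quad(neg_p_log_p, -np.inf, np.inf)

print(h_numeric)                       # ~1.4189385
print(0.5 * np.log(2 * np.pi * np.e))  # 1.4189385... (closed form)
print(rv.entropy())                    # same value from scipy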

Upvotes: 2

Views: 1983

Answers (2)

phipsgabler

Reputation: 20950

You have to store a continuous random variable in some parametrized way anyway, unless you work with an approximation. In the parametrized case, you usually work with distribution objects; and for known distributions, closed-form formulae for the differential entropy in terms of the parameters exist.

Scipy accordingly provides an entropy method for rv_continuous that calculates the differential entropy where possible:

In [5]: import scipy.stats as st                                                                                                                             

In [6]: rv = st.beta(0.5, 0.5)                                                                                                                               

In [7]: rv.entropy()                                                                                                                                         
Out[7]: array(-0.24156448)
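
For what it's worth, that value agrees with the closed-form Beta entropy, ln B(a, b) - (a - 1) * digamma(a) - (b - 1) * digamma(b) + (a + b - 2) * digamma(a + b). A quick sketch of that cross-check:

from scipy import stats
from scipy.special import betaln, digamma

a, b = 0.5, 0.5
rv = stats.beta(a, b)

# closed-form differential entropy of Beta(a, b)
closed_form = (betaln(a, b)
               - (a - 1) * digamma(a)
               - (b - 1) * digamma(b)
               + (a + b - 2) * digamma(a + b))

print(rv.entropy())  # -0.24156...
print(closed_form)   # same value from the closed form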

Upvotes: 1

david.a.

Reputation: 19

The actual question here is how you store a continuous random variable in memory. You might use some discretization technique and compute the entropy of the resulting discrete random variable.
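
As a rough sketch of that idea (binning samples and correcting for the bin width; the sample size and bin count here are just for illustration):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
samples = rng.normal(size=100_000)

counts, edges = np.histogram(samples, bins=200)
width = edges[1] - edges[0]            # equal-width bins
p = counts / counts.sum()              # bin probabilities
mask = p > 0

# discrete entropy of the binned variable plus a log bin-width correction
h_est = -(p[mask] * np.log(p[mask])).sum() + np.log(width)

print(h_est)                   # close to 0.5 * log(2 * pi * e) ~ 1.4189
print(stats.norm().entropy())  # exact differential entropy for comparison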

You may also check TensorFlow Probability, which treats distributions essentially as tensor-backed objects and provides an entropy() method on its Distribution classes.
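
A minimal sketch of that, assuming tensorflow_probability is installed:

import tensorflow_probability as tfp

tfd = tfp.distributions

# Distribution.entropy() returns the (differential) entropy as a tensor
print(tfd.Normal(loc=0., scale=1.).entropy())  # ~1.4189385
print(tfd.Beta(0.5, 0.5).entropy())            # ~-0.2416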

Upvotes: 0
