Florian Piringer
Florian Piringer

Reputation: 1

compute an integral using scipy where the integrand is a product with parameters coming from a (arbitrarily long) list

I want to solve the coupon collector's problem in the general case (with varying possibilities for each coupon) using Flajolet's formula that I found on wikipedia (see https://en.wikipedia.org/wiki/Coupon_collector%27s_problem). According to the formula I have to compute an integral where the integrand is a product. I'm using scipy.integrad.quad and lambda-notation to integrate. Problem is that the number of factors in the integrand is not fixed (has parameters coming from a list). When I try to multiply the integrand factors I get an error, since I cannot multiply formal expressions, seemingly. But if I don't, I don't know to get the integration variable x in.

I found ways to integrate a product, if there are, for example, only 2 factors. And it doesn't seem to involve double integration or some such. Can anyone please help (I'm quite new to this stuff)?

import numpy as np
from scipy import integrate
....
def compute_general_case(p_list):
    integrand = 1
    for p in p_list:
        integrand_factor = lambda x: 1 - np.exp(-p * x)
        integrand *= integrand_factor
    integrand = 1 - integrand
    erg = integrate.quad(integrand, 0, np.inf)
    print(erg)

Upvotes: 0

Views: 71

Answers (2)

Erotemic
Erotemic

Reputation: 5238

Thanks for addressing this question: as a follow on to @Mstaino's answer, you don't even need the args as you can pass them into the function via a closure:

def coupon_collector_expected_samples(probs):
    """
    Find the expected number of samples before all "coupons" (with a
    non-uniform probability mass) are "collected".

    Args:
        probs (ndarray): probability mass for each unique item

    References:
        https://en.wikipedia.org/wiki/Coupon_collector%27s_problem
        https://www.combinatorics.org/ojs/index.php/eljc/article/view/v20i2p33/pdf
        https://stackoverflow.com/questions/54539128/scipy-integrand-is-product

    Example:
        >>> import numpy as np
        >>> import ubelt as ub
        >>> # Check EV of samples for a non-uniform distribution
        >>> probs = [0.38, 0.05, 0.36, 0.16, 0.05]
        >>> ev = coupon_collector_expected_samples(probs)
        >>> print('ev = {}'.format(ub.repr2(ev, precision=4)))
        ev = 30.6537

        >>> # Use general solution on a uniform distribution
        >>> probs = np.ones(4) / 4
        >>> ev = coupon_collector_expected_samples(probs)
        >>> print('ev = {}'.format(ub.repr2(ev, precision=4)))
        ev = 8.3333

        >>> # Check that this is the same as the solution for the uniform case
        >>> import sympy
        >>> n = len(probs)
        >>> uniform_ev = float(sympy.harmonic(n) * n)
        >>> assert np.isclose(ev, uniform_ev)
    """
    import numpy as np
    from scipy import integrate
    probs = np.asarray(probs)
    # Philippe Flajolet's generalized expected value integral
    def _integrand(t):
        return 1 - np.product(1 - np.exp(-probs * t))
    ev, abserr = integrate.quad(func=_integrand, a=0, b=np.inf)
    return ev

You can also see that my implementation is faster:

import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)

probs = np.random.rand(100)

from scipy import integrate
def orig_method(p_list):
    def integrand(x, *p_list):
        p_list = np.asarray(p_list)
        return 1 - np.product(1 - np.exp(-x * p_list))
    result, abserr = integrate.quad(integrand, 0, np.inf, args=p_list)
    return result

ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('orig_implementation'):
    with timer:
        orig_method(probs)

for timer in ti.reset('my_implementation'):
    with timer:
        coupon_collector_expected_samples(probs)

# Results:
# Timed orig_implementation for: 100 loops, best of 10
#     time per loop: best=7.267 ms, mean=7.954 ± 0.5 ms
# Timed my_implementation for: 100 loops, best of 10
#     time per loop: best=5.618 ms, mean=5.648 ± 0.0 ms

Upvotes: 0

Tarifazo
Tarifazo

Reputation: 4343

You can define the integration function with an arbitrary number of arguments, provided you pass them to quad using args=:

def integrand(x, *p_list):
    p_list = np.asarray(p_list)
    return 1 - np.product(1 - np.exp(-x * p_list))   #don't need to for-loop a product in numpy

result, abserr = quad(integrand, 0, np.inf, args=[1,1,1,1])
print(result, abserr)
>> 2.083333333333334 2.491001112400493e-10

For more information, see here

Upvotes: 2

Related Questions