Reputation: 1
I want to solve the coupon collector's problem in the general case (with varying possibilities for each coupon) using Flajolet's formula that I found on wikipedia (see https://en.wikipedia.org/wiki/Coupon_collector%27s_problem). According to the formula I have to compute an integral where the integrand is a product. I'm using scipy.integrad.quad and lambda-notation to integrate. Problem is that the number of factors in the integrand is not fixed (has parameters coming from a list). When I try to multiply the integrand factors I get an error, since I cannot multiply formal expressions, seemingly. But if I don't, I don't know to get the integration variable x in.
I found ways to integrate a product, if there are, for example, only 2 factors. And it doesn't seem to involve double integration or some such. Can anyone please help (I'm quite new to this stuff)?
import numpy as np
from scipy import integrate
....
def compute_general_case(p_list):
integrand = 1
for p in p_list:
integrand_factor = lambda x: 1 - np.exp(-p * x)
integrand *= integrand_factor
integrand = 1 - integrand
erg = integrate.quad(integrand, 0, np.inf)
print(erg)
Upvotes: 0
Views: 71
Reputation: 5238
Thanks for addressing this question: as a follow on to @Mstaino's answer, you don't even need the args
as you can pass them into the function via a closure:
def coupon_collector_expected_samples(probs):
"""
Find the expected number of samples before all "coupons" (with a
non-uniform probability mass) are "collected".
Args:
probs (ndarray): probability mass for each unique item
References:
https://en.wikipedia.org/wiki/Coupon_collector%27s_problem
https://www.combinatorics.org/ojs/index.php/eljc/article/view/v20i2p33/pdf
https://stackoverflow.com/questions/54539128/scipy-integrand-is-product
Example:
>>> import numpy as np
>>> import ubelt as ub
>>> # Check EV of samples for a non-uniform distribution
>>> probs = [0.38, 0.05, 0.36, 0.16, 0.05]
>>> ev = coupon_collector_expected_samples(probs)
>>> print('ev = {}'.format(ub.repr2(ev, precision=4)))
ev = 30.6537
>>> # Use general solution on a uniform distribution
>>> probs = np.ones(4) / 4
>>> ev = coupon_collector_expected_samples(probs)
>>> print('ev = {}'.format(ub.repr2(ev, precision=4)))
ev = 8.3333
>>> # Check that this is the same as the solution for the uniform case
>>> import sympy
>>> n = len(probs)
>>> uniform_ev = float(sympy.harmonic(n) * n)
>>> assert np.isclose(ev, uniform_ev)
"""
import numpy as np
from scipy import integrate
probs = np.asarray(probs)
# Philippe Flajolet's generalized expected value integral
def _integrand(t):
return 1 - np.product(1 - np.exp(-probs * t))
ev, abserr = integrate.quad(func=_integrand, a=0, b=np.inf)
return ev
You can also see that my implementation is faster:
import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)
probs = np.random.rand(100)
from scipy import integrate
def orig_method(p_list):
def integrand(x, *p_list):
p_list = np.asarray(p_list)
return 1 - np.product(1 - np.exp(-x * p_list))
result, abserr = integrate.quad(integrand, 0, np.inf, args=p_list)
return result
ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('orig_implementation'):
with timer:
orig_method(probs)
for timer in ti.reset('my_implementation'):
with timer:
coupon_collector_expected_samples(probs)
# Results:
# Timed orig_implementation for: 100 loops, best of 10
# time per loop: best=7.267 ms, mean=7.954 ± 0.5 ms
# Timed my_implementation for: 100 loops, best of 10
# time per loop: best=5.618 ms, mean=5.648 ± 0.0 ms
Upvotes: 0
Reputation: 4343
You can define the integration function with an arbitrary number of arguments, provided you pass them to quad
using args=
:
def integrand(x, *p_list):
p_list = np.asarray(p_list)
return 1 - np.product(1 - np.exp(-x * p_list)) #don't need to for-loop a product in numpy
result, abserr = quad(integrand, 0, np.inf, args=[1,1,1,1])
print(result, abserr)
>> 2.083333333333334 2.491001112400493e-10
For more information, see here
Upvotes: 2