efficiently calculating double integral

Question

I calculating following integral using scipy:

from scipy.stats import norm
def integrand(y, x):
    # print "y: %s  x: %s" % (y,x)
    return (du(y)*measurment_outcome_belief(x, 3)(y))*fv_belief(item.mean, item.var)(x)
    return dblquad(
        integrand, norm.ppf(0.001, item.mean, item.var), 
        norm.ppf(0.999, item.mean, item.var),
        lambda x: norm.ppf(0.001, x, 3), 
        lambda x: norm.ppf(0.999, x, 3))[0]

I have belief state of item (normal distribution) and measurement conditional on real value of item.(also normal distributed) Using this integral i calculate value of information(how useful to measure this item).

The problem that, calculating this integral takes lots of time. Is there perhaps more efficient way to calculate it( I dont need 100% precision) , like monte carlo integration or something similar?

I know there is skmonaco library in python for monte carlo integration, but limits of integral has to be numbers, unlike in scipy, inner integral limits, depends on outer(for example from above

lambda x: norm.ppf(0.001, x, 3)

) Here how double integral calculated using skmonaco

>>> from skmonaco import mcquad
>>> mcquad(lambda x_y: x_y[0]*x_y[1], # integrand
...     xl=[0.,0.],xu=[1.,1.], # lower and upper limits of integration
...     npoints=100000 # number of points
...     )

As you can see limits of inner integral here not dependent on outer integral. Can someone recommend library or way to calculate this integral efficiently?

Pascal Bugnion · Accepted Answer

The easiest way to deal with non-cubic integration volumes in scikit-monaco is to re-define your integration function so that it returns 0 outside of the region of integration (see this section of the documentation):

def modified_integrand(xs):
    x, y = xs
    if norm.ppf(0.001, x, 3) < y < norm.ppf(0.999, x, 3):
        return integrand(xs) # the actual integrand
    else:
        return 0.0

As Ami Tavori says, this will be quite inefficient if large sections of your integration region are zero. To mediate this, you can use the MISER or VEGAS algorithms: both these algorithms 'learn' the shape of the integral as they run to distribute points more efficiently in regions of interest.

That said, your region of integration is basically a rotated rectangle:

import matplotlib.pyplot as plt
from scipy.stats import norm
import numpy as np

xs = np.linspace(-10, 10)
ys = np.linspace(-10, 10)

# Plot integration domain
# Red regions are in the domain of integration, blue
# regions are outside
plt.contourf(xs, ys, 
     [ [ 1 if norm.ppf(0.001, x, 3) < y < norm.ppf(0.999, x, 3) 
         else 0 for x in xs ] for y in ys ])

Integration domain

In this case, you would be much better off rotating the integration coordinates. For instance, defining

r = x - y
R = (x + y)/2.0

Your integrand is then:

def rotated_integrand(rs):
    R, r = rs
    x = R + r/2.0
    y = R - r/2.0
    return integrand(np.array([x,y]))

The integration limits along r are now a constant (-9.27..9.27, in the example you've given). The integration limit along R is still (-inf, inf), so you will need to increase the integration region along R gradually until your integral converges. I would definitely recommend using the MISER algorithm for this (mcmiser in scikit-monaco), rather than uniform sampling.

Finally, judging by the names of the functions you have used, it looks like you're doing some form of Bayesian updating. If that's the case, you might consider using the PyMC library for Markov Chain Monte Carlo, which might be more suitable than a general purpose MC integration library.

efficiently calculating double integral

Answers (2)

Related Questions