Farseer
Farseer

Reputation: 4172

Limits of quad integration in scipy with np.inf

Here an example with results:

I integrate through Gaussian distribution(mu=800, sigma=1) with ~ +-2sigma ppf and same integral from -inf to +inf. For some reason second integral results in zero, but in practice it should be more accurate.

Can someone explain, why such anomaly happens or where i made a mistake?

code:

from scipy.integrate import quad
import numpy as np
from scipy.stats import norm

def integrand(x):
    return x*norm.pdf(x, 800, 1)
print quad(integrand, norm.ppf(0.05, 800,1), norm.ppf(0.95, 800,1))
print quad(integrand, -np.inf, np.inf)

(719.9999999999894, 5.913323331834147e-11)
(0.0, 0.0) 

EDIT: By the way, when mean is small( for example 2) , it works fine - both integrals results are very close.

Upvotes: 4

Views: 2846

Answers (2)

B. M.
B. M.

Reputation: 18668

quad use heuristic algorithms, using adaptative step of integration to reduce time computing. Where the function is flat, it goes faster. so on big global interval, it can miss the peak.

You can help quad by suggesting "points of interest", to help him to find the regions with difficulties :

>>> quad(integrand,0,1000)
(3.8929062783235445e-32, 7.210678067622767e-32)
>>> quad(integrand,0,1000,points=[750])
(799.9999999999993, 2.0260999142842332e-07)

You can see the result of quad investigations with the full_output keyword :

>>>quad(integrand,0,1000,full_output=True)[2]['rlist'].max()
3.8929062783235445e-32

Here quad never select points where the integrand value exceed 1e-31, so it
infer that the function is null everywhere.

Upvotes: 1

Farseer
Farseer

Reputation: 4172

It seems that this behavior is expected.

When limits of integration is so huge - from -inf to inf, it is hard for quad to find relatively small bump at around 800, so the result of integral will be zero.

The reason it works when mean is small, is because the first split quad doing at the middle of given range which is zero, so when mean is small, integral will find the positive values at first split. (Probably it using some sort of trapezoid integration, i am not sure). So solution to this problem will be : Or give limits of integral a real numbers(making our search space much much smaller) Or making sure one of the splits will find positive values in range.(For example making sure that the mean is around zero).

Upvotes: 0

Related Questions