Occhima
Occhima

Reputation: 253

Markov Chain Monte Carlo integration and infinite while loop

I'm implementing a Markov Chain Monte Carlo with both Metropolis and Barker's α's for numerical integration. I've created a class called MCMCIntegrator(). Below the __init__() method, are the g(x) the PDF of the function I'm integrating and alpha method, implementing the Metropolis and Barker α's.

import numpy as np
import scipy.stats as st

class MCMCIntegrator:

    def __init__(self):
        self.size = 1000
        self.std = 0.6
        self.real_int = 0.06496359
        self.sample = None

    @staticmethod
    def g(x):
        return st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))

    def alpha(self, a, b, method):

        if method:
            return min(1, self.g(b) / self.g(a))

        else:
            return self.g(b) / (self.g(a) + self.g(b))

The size is the size of the sample that the class must generate, std is the standard deviation of the normal kernel, which you will see in a few seconds. The real_int is the value of the integral from 1 to 2 of the function we're integrating. I've generated it with a R script. Now, to the problem.

     def _chain(self, method):

        """
            Markov chain heat-up with burn-in

        :param method: Metropolis or Barker alpha
        :return: np.array containing the sample
        """
        old = 0
        sample = np.zeros(int(self.size * 1.3))
        i = 0

        while i != len(sample):
            new = np.random.normal(loc=old, scale=self.std)
            new = abs(new)
            al = self.alpha(old, new, method=method)
            u = np.random.uniform()

            if al > u:
                sample[i] = new
                i += 1
                old = new

        return np.array(sample)

Below this method is an integrate() method that calculates the proportion of numbers in the [1, 2] interval:

    def integrate(self, method=None):
        """
            Integration step

        """

        sample = self._chain(method=method)

        # discarding 30% of the sample for the burn-in
        ind = int(len(sample)*0.3)
        sample = sample[ind:]
        setattr(self, "sample", sample)

        sample = [1 if 1 < v < 2 else 0 for v in sample]
        return np.mean(sample)

this is the main function:

def main():

    print("-- RESULTS --".center(20), end='\n')
    mcmc = MCMCIntegrator()
    print(f"\t{mcmc.integrate()}", end='\n')
    print(f"\t{np.abs(mcmc.integrate() - mcmc.real_int) / mcmc.real_int}")


if __name__ == "__main__":
    main()

I'm stuck in an infinite while loop, with no idea why this is happening.

Upvotes: 1

Views: 429

Answers (1)

While I have no prior exposure to Python and no direct explanation for the infinite loop, here are some problematic issues in the code:

  1. The

     while i != len(sample):
    

    loop increments the value i only when the Uniform variate is below the acceptance probability if al > u: This is not how Metropolis-Hastings operates. If the Uniform variate is above the acceptance probability, the current value of the chain is duplicated. but this does not explain for the infinite loop since a proposed value should eventually be accepted.

  2. If the target density is

    st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))  
    

    then (i) what is the point of the second and constant term np.abs(np.cos(1.10257704)) and (ii) where is the need for so many digits?

  3. The proposal distribution

       new = np.random.normal(loc=old, scale=self.std)
       new = abs(new)
    

    is a folded normal, which density is not symmetric. Hence it should appear in the Metropolis-Hastings probability but may have little impact since the scale is small.

Here is my R rendering of the Python code (edited and corrected)

self.size = 1e5
self.std = 0.6
self.real_int = 0.06496359

g <- function(x){dgamma(x, shape=1, scale=1.378)}

alpha <- function(a, b, method=1)ifelse(method,
            min(1, r <- g(b) / g(a)), 1 / (1 + 1 / r))

old = 0 
smple = rep(0,self.size * 1.3)
for (i in 1:length(smple)){
            new = abs(old+self.std*rnorm(1))
            al = alpha(old, new, 0)
            old=smple[i]=ifelse(al > runif(1), new, old)
             }

ind = trunc(length(smple)*0.3)
smple = sample[ind:length(smple)]

hist(smple,pro=TRUE,nclass=10*log2(self.size),col="wheat")
curve(g(x),add=TRUE,lwd=2,col="sienna")

clearly reproducing the Gamma target:

enter image description here

without the correction for the non-symmetric proposal. The correction would be

q <- function(a, b)
        dnorm(b-a,sd=self.std)+dnorm(-b-a,sd=self.std)

alpha <- function(a, b, method=1){
        return(ifelse(method,
            min(1, r <- g(b) * q(b,a) / g(a) / q(a,b)),
            1 / (1 + 1/r)))}

old = 0 
smple = rep(0,self.size * 1.3)
for (i in 1:length(smple)){
            new = abs(old+self.std*rnorm(1))
            al = alpha(old, new, 3)
            old=smple[i]=ifelse(al > runif(1), new, old)
             }

and makes no difference in the above picture. (The acceptance rate for the Metropolis ratio is 85%, while for Baker's, it is 48%.)

Upvotes: 2

Related Questions