Reputation: 253
I'm implementing a Markov Chain Monte Carlo algorithm with both the Metropolis and Barker's α's for numerical integration. I've created a class called MCMCIntegrator(). Below the __init__() method are g(x), the PDF of the function I'm integrating, and the alpha() method, implementing the Metropolis and Barker α's.
import numpy as np
import scipy.stats as st


class MCMCIntegrator:

    def __init__(self):
        self.size = 1000
        self.std = 0.6
        self.real_int = 0.06496359
        self.sample = None

    @staticmethod
    def g(x):
        return st.gamma.pdf(x, 1, scale=1.378008857) * np.abs(np.cos(1.10257704))

    def alpha(self, a, b, method):
        if method:
            return min(1, self.g(b) / self.g(a))
        else:
            return self.g(b) / (self.g(a) + self.g(b))
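Just to make the method flag explicit (this little check is not part of the class itself): a truthy method selects the Metropolis rule, a falsy one Barker's.

mcmc = MCMCIntegrator()
# Metropolis rule: min(1, g(b) / g(a))
print(mcmc.alpha(0.5, 1.5, method=1))
# Barker's rule: g(b) / (g(a) + g(b))
print(mcmc.alpha(0.5, 1.5, method=0))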
The size is the size of the sample that the class must generate, and std is the standard deviation of the normal kernel, which you will see in a moment. real_int is the value of the integral from 1 to 2 of the function I'm integrating; I computed it with an R script. Now, to the problem.
    def _chain(self, method):
        """
        Markov chain heat-up with burn-in
        :param method: Metropolis or Barker alpha
        :return: np.array containing the sample
        """
        old = 0
        sample = np.zeros(int(self.size * 1.3))
        i = 0
        while i != len(sample):
            new = np.random.normal(loc=old, scale=self.std)
            new = abs(new)
            al = self.alpha(old, new, method=method)
            u = np.random.uniform()
            if al > u:
                sample[i] = new
                i += 1
                old = new
        return np.array(sample)
Below this method is an integrate()
method that calculates the proportion of numbers in the [1, 2] interval:
    def integrate(self, method=None):
        """
        Integration step
        """
        sample = self._chain(method=method)
        # discarding 30% of the sample for the burn-in
        ind = int(len(sample) * 0.3)
        sample = sample[ind:]
        setattr(self, "sample", sample)
        sample = [1 if 1 < v < 2 else 0 for v in sample]
        return np.mean(sample)
This is the main function:
def main():
    print("-- RESULTS --".center(20), end='\n')
    mcmc = MCMCIntegrator()
    print(f"\t{mcmc.integrate()}", end='\n')
    print(f"\t{np.abs(mcmc.integrate() - mcmc.real_int) / mcmc.real_int}")


if __name__ == "__main__":
    main()
I'm stuck in an infinite while loop, with no idea why this is happening.
Upvotes: 1
Views: 429
Reputation: 862
While I have no prior exposure to Python and no direct explanation for the infinite loop, here are some problematic issues in the code:
The

while i != len(sample):

loop increments the value of i only when the Uniform variate falls below the acceptance probability, that is, when

if al > u:

holds. This is not how Metropolis-Hastings operates: when the Uniform variate is above the acceptance probability, the current value of the chain should be duplicated instead. But this does not explain the infinite loop, since a proposed value should eventually be accepted.
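In Python, a minimal sketch of that correction (a sketch only, reusing the class attributes from the question) would duplicate the current value whenever a proposal is rejected:

def _chain(self, method):
    old = 0
    sample = np.zeros(int(self.size * 1.3))
    for i in range(len(sample)):
        # folded-normal proposal centred at the current value
        new = abs(np.random.normal(loc=old, scale=self.std))
        if self.alpha(old, new, method=method) > np.random.uniform():
            old = new        # accept the proposed value
        sample[i] = old      # on rejection, the current value is duplicated
    return sample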
If the target density is

st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))

then (i) what is the point of the second, constant factor np.abs(np.cos(1.10257704)), and (ii) why the need for so many digits?
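The constant factor cancels in both acceptance ratios anyway, as a quick check shows (a sketch, using the imports from the question; g0 is just a hypothetical name for the density without the constant):

c = np.abs(np.cos(1.10257704))
g0 = lambda x: st.gamma.pdf(x, 1, scale=1.378008857)   # density without the constant factor
a, b = 0.5, 1.5
print((c * g0(b)) / (c * g0(a)), g0(b) / g0(a))                          # same Metropolis ratio
print((c * g0(b)) / (c * g0(a) + c * g0(b)), g0(b) / (g0(a) + g0(b)))    # same Barker ratio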
The proposal distribution

new = np.random.normal(loc=old, scale=self.std)
new = abs(new)

is a folded Normal, whose density is not symmetric. It should therefore appear in the Metropolis-Hastings acceptance probability, although it may have little impact since the scale is small.
Here is my R rendering of the Python code (edited and corrected):

self.size = 1e5
self.std = 0.6
self.real_int = 0.06496359

g <- function(x) dgamma(x, shape=1, scale=1.378)

alpha <- function(a, b, method=1){
  r <- g(b) / g(a)
  ifelse(method, min(1, r), 1 / (1 + 1 / r))}

old = 0
smple = rep(0, self.size * 1.3)
for (i in 1:length(smple)){
  new = abs(old + self.std * rnorm(1))
  al = alpha(old, new, 0)
  old = smple[i] = ifelse(al > runif(1), new, old)
}

ind = trunc(length(smple) * 0.3)
smple = smple[ind:length(smple)]
hist(smple, prob=TRUE, nclass=10*log2(self.size), col="wheat")
curve(g(x), add=TRUE, lwd=2, col="sienna")
clearly reproducing the Gamma target:
without the correction for the non-symmetric proposal. The correction would be:

q <- function(a, b)
  dnorm(b - a, sd=self.std) + dnorm(-b - a, sd=self.std)

alpha <- function(a, b, method=1){
  r <- g(b) * q(b, a) / (g(a) * q(a, b))
  return(ifelse(method, min(1, r), 1 / (1 + 1 / r)))}

old = 0
smple = rep(0, self.size * 1.3)
for (i in 1:length(smple)){
  new = abs(old + self.std * rnorm(1))
  al = alpha(old, new, 3)
  old = smple[i] = ifelse(al > runif(1), new, old)
}
and makes no difference in the above picture. (The acceptance rate for the Metropolis ratio is 85%, while for Barker's it is 48%.)
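For readers following along in Python, the same folded-normal correction could be sketched as below (a sketch mirroring the R functions q and alpha above; alpha_corrected is a name of my choosing, not part of the original class):

def q(a, b, std=0.6):
    # folded-normal proposal density for moving from a to b = |a + std * N(0, 1)|
    return st.norm.pdf(b - a, scale=std) + st.norm.pdf(-b - a, scale=std)

def alpha_corrected(a, b, method=1):
    r = MCMCIntegrator.g(b) * q(b, a) / (MCMCIntegrator.g(a) * q(a, b))
    return min(1, r) if method else r / (1 + r)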
Upvotes: 2