Reputation: 2600
I would like to use a Binomial distribution that is shifted by the parameter loc
(as in scipy
) in a pymc3 model.
E.g.:
with pm.Model() as m1:
prob = pm.Beta('prob',alpha=2,beta=2)
x = pm.Binomial('x',n=20,p=prob,loc=5)
But Binomial
does not allow for a shift parameter.
I tried to build it myself following the various tutorials on pymc3's website, but without success (I am very much a beginner in using pymc3 and theano). My latest attempt (probably very bad)
...
from scipy.stats import binom
class BinoShift(pm.Discrete):
def __init__(self, n, p, x, *args, **kwargs):
super(BinoShift, self).__init__(*args, **kwargs)
self.n = n
self.p = p
self.mode = np.round(n*p)
self.shift = x
def logp(self, value):
n = self.n
p = self.p
shift = self.shift
return binom.logpmf(value,n,p,loc=shift)
Backround: I have observations on random variable X = X_0 + z
where z
is a unobserved latent variable, X_0
is unobserved and Binomially distributed with (N-z,p
) with N
known. Final goal is to obtain a posterior distribution over p
and z
. This corresponds pretty much to a mixture model problem with unobserved cluster assignments. X \sim \sum_z p(z)(z + Bino(p,N-z))
. So if I had the binomial distribution with a shift parameter the pymc3 model that I envision looks something like
# generate data; kept simple here, but N and z may actually differ across sample
size = 500
N = 20
p = 0.7
z = 5
X = np.random.binomial(N-z,p,size=size) + z
with pm.Model() as mixture:
prob = pm.Beta('prob',alpha=2,beta=2)
weight = pm.Dirichlet('weight',a=np.array([1]*N))
comp = [pm.Binomial('X_{}'.format(i),n=N-i,p=prob,loc=i) for i in range(N)]
like = pm.Mixture('like',w=weight,comp_dists=comp,observed=X)
Other ways I tried to build this problem into a pymc3 model included a hierachical model with final line referring to the distribution of X_0
given the other paramters/unkowns, which is simply the Binomial distribution. But then I wouldn't how to pass as "observed" values, X-z. Another way I thought of was to first define the distributions of z
and X_0
and then use pm.Deterministic
for B
. But the deterministic class does not accept observed values (I guess as it would not know how to evaluate the likelihood).
Upvotes: 0
Views: 390
Reputation: 2600
Copy-pasted the source code by pymc3 and added the loc parameter (changes are marked):
import numpy as np
import theano.tensor as tt
from pymc3.distributions.dist_math import bound, binomln, logpow
from pymc3.math import tround
from pymc3.theanof import floatX, intX
from pymc3.distributions.distribution import Discrete
class BinoShift(Discrete):
def __init__(self, n, p, loc, *args, **kwargs): # <---
super().__init__(*args, **kwargs)
self.n = n = tt.as_tensor_variable(intX(n))
self.loc = loc = tt.as_tensor_variable(intX(loc)) # <---
self.p = p = tt.as_tensor_variable(floatX(p))
self.mode = tt.cast(tround(n * p), self.dtype)
def logp(self, value):
n = self.n
p = self.p
loc = self.loc # <---
k = value-loc # <---
return bound(
binomln(n, k) + logpow(p, k) + logpow(1 - p, n - k),
0 <= k, k <= n,
0 <= p, p <= 1) # <---
Upvotes: 1