Reputation: 35
Let’s say I have a dataset consisting of two subsets of binary observations. The subsets have the same proportions but different lengths. Based on a fixed Beta prior and the two binomial likelihoods, I want to generate two posterior distributions of p. Then I will collect the posterior means of p in a list of estimates. Instead of building two separate models, I want to do all the inference within a single model. Below is my PyMC3 model:
with pm.Model() as single_model:
    data = [[1, 0, 1, 1, 0, 0, 0, 0], [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1]]
    estimations = []
    p = pm.Beta('p', alpha=1, beta=1)
    for i in data:
        y = pm.Binomial('y', n=1, p=p, observed=i)
        trace = pm.sample(500)
        estimations.append(trace['p'].mean())
When I run the code, it raises an error like this:
ValueError: Variable name y already exists.
I tried to define a shape parameter within the likelihood, like shape=len(data), but it did not work. How can I generate multiple estimations within the same model?
Upvotes: 0
Views: 33
Reputation: 2070
You can do
import numpy as np
import pymc as pm

data = [1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1]
# Group index per observation: the first 8 belong to subset 0, the remaining 16 to subset 1
idx = np.repeat([0, 1], [8, 16])
with pm.Model() as single_model:
    p = pm.Beta('p', alpha=1, beta=1, shape=2)
    y = pm.Binomial('y', n=1, p=p[idx], observed=data)
    idata = pm.sample()
or
data = [[1, 0, 1, 1, 0, 0, 0, 0], [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1]]
with pm.Model() as single_model:
    p = pm.Beta('p', alpha=1, beta=1, shape=2)
    y0 = pm.Binomial('y0', n=1, p=p[0], observed=data[0])
    y1 = pm.Binomial('y1', n=1, p=p[1], observed=data[1])
    idata = pm.sample()
As a general rule, avoid for-loops and use vectorized versions instead. Also, PyMC3 is no longer maintained, so use PyMC instead.
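As a sanity check on either variant, note that a Beta prior is conjugate to the Bernoulli/Binomial likelihood, so the posterior means that sampling should approximate are available in closed form. The following is a minimal sketch using only NumPy. The names `lengths`, `flat`, and `analytic_means` are illustrative and not part of the code above. It also shows one way to derive the group index from the subset lengths rather than hard-coding it.

```python
import numpy as np

data = [[1, 0, 1, 1, 0, 0, 0, 0],
        [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1]]

# Derive the group index from the subset lengths instead of hard-coding it.
lengths = [len(d) for d in data]                  # one length per subset
idx = np.repeat(np.arange(len(data)), lengths)    # 0 for subset 0, 1 for subset 1
flat = np.concatenate(data)                       # all observations in one array

# Conjugate update: a Beta(a, b) prior with k successes in n trials
# gives a Beta(a + k, b + n - k) posterior, whose mean is (a + k) / (a + b + n).
a, b = 1.0, 1.0
k = np.bincount(idx, weights=flat)   # successes per subset
n = np.bincount(idx)                 # trials per subset
analytic_means = (a + k) / (a + b + n)
print(analytic_means)
```

Assuming `idata` is the InferenceData object returned by `pm.sample()`, the sampled posterior means from `idata.posterior['p'].mean(dim=('chain', 'draw'))` should land close to these analytic values, up to Monte Carlo error.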
Upvotes: 0