user27886

Reputation: 582

direct log probability increment (target += ...) in pymc3 as in stan

In the probabilistic programming language stan, given the data/params blocks:

data {
    int N;
    real y[N];
}

parameters {
    real mu;
}

(Note that data is a reserved word in Stan, so the observed array is named y here.)

The following model blocks are equivalent:

"Sample notation":

model {
    y ~ normal(mu, 1);
}

"Direct log probability increment notation":

model {
    for (n in 1:N)
        target += normal_lpdf(y[n] | mu, 1);
}

where target represents the total log density that the MCMC (NUTS) sampler explores. The benefit of the latter notation is greater flexibility in defining the log probability model: the values whose log density is accumulated can be quantities the model computes (y[n] in the example above, but this extends to many other contexts).
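Conceptually, the loop above just accumulates per-observation log density terms into a running total. A plain-Python sketch (not Stan or PyMC3 API; function names here are illustrative):

```python
import math

def normal_lpdf(y, mu, sigma):
    # log density of Normal(mu, sigma) evaluated at y
    return (-0.5 * math.log(2 * math.pi) - math.log(sigma)
            - 0.5 * ((y - mu) / sigma) ** 2)

def model_log_density(mu, data):
    target = 0.0                            # Stan's target starts at zero
    for y in data:
        target += normal_lpdf(y, mu, 1.0)   # target += normal_lpdf(y[n] | mu, 1)
    return target

# a single observation at the mean contributes -0.5 * log(2 * pi)
print(model_log_density(0.0, [0.0]))
```

The sampler only ever needs this total (and its gradient), which is why any expression you can compute may be added to it.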

The sample notation can be applied in pymc3 (python) as:

import pymc3 as pm

with pm.Model() as model:
    mu = pm.Flat('mu')
    # data is the observed array (e.g. a numpy array)
    x = pm.Normal('x', mu=mu, sd=1.0, observed=data)

Question: How can I apply the same direct log probability increment notation, where I specify the samples, in pymc3?

Upvotes: 1

Views: 312

Answers (1)

colcarroll

Reputation: 3682

You can run this using a pm.Potential, as in:

import numpy as np
import pymc3 as pm

data = 5 + 3 * np.random.randn(20)


with pm.Model() as model:
    x = pm.Flat('x')
    pm.Potential('y', pm.Normal.dist(x, 1).logp(data))

You can verify that this is doing the right thing via:

import scipy.stats as st

print(st.norm(0, 1).logpdf(data).sum(), model.logp({'x': 0}))

Note that Potential has to contain a Theano expression; pm.Normal.dist(x, 1) happens to be implemented in Theano. You could also write the log pdf explicitly:

import theano.tensor as tt

with pm.Model() as model:
    x = pm.Flat('x')
    pm.Potential('y', -0.5 * tt.sum((x - data) ** 2) 
                     - 0.5 * len(data) * np.log(2 * np.pi))
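You can sanity-check the explicit expression without PyMC3 at all. A sketch assuming the same data array and unit standard deviation; x0 is an arbitrary test point:

```python
import numpy as np
import scipy.stats as st

data = 5 + 3 * np.random.randn(20)
x0 = 1.2  # arbitrary value of x at which to evaluate the log density

# hand-written unit-variance normal log pdf, summed over the data
manual = -0.5 * np.sum((x0 - data) ** 2) - 0.5 * len(data) * np.log(2 * np.pi)
reference = st.norm(x0, 1).logpdf(data).sum()

print(np.isclose(manual, reference))  # True
```

This is the same check as the model.logp comparison above, just with numpy/scipy standing in for the Theano graph.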

Upvotes: 3
