Reputation: 11637
The following model, disaster_model.py, is part of the PyMC tutorial. It can be imported into the main code and used as a model:
"""
A model for the disasters data with a changepoint
switchpoint ~ U(0, 110)
early_mean ~ Exp(1.)
late_mean ~ Exp(1.)
disasters[t] ~ Po(early_mean if t <= switchpoint, late_mean otherwise)
"""
from pymc import *
from numpy import array, empty
from numpy.random import randint
__all__ = ['disasters_array', 'switchpoint', 'early_mean', 'late_mean', 'rate', 'disasters']
disasters_array = array([ 4, 5, 4, 0, 1, 4, 3, 4, 0, 6, 3, 3, 4, 0, 2, 6,
3, 3, 5, 4, 5, 3, 1, 4, 4, 1, 5, 5, 3, 4, 2, 5,
2, 2, 3, 4, 2, 1, 3, 2, 2, 1, 1, 1, 1, 3, 0, 0,
1, 0, 1, 1, 0, 0, 3, 1, 0, 3, 2, 2, 0, 1, 1, 1,
0, 1, 0, 1, 0, 0, 0, 2, 1, 0, 0, 0, 1, 1, 0, 2,
3, 3, 1, 1, 2, 1, 1, 1, 1, 2, 4, 2, 0, 0, 1, 4,
0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1])
# Define data and stochastics
switchpoint = DiscreteUniform('switchpoint', lower=0, upper=110, doc='Switchpoint[year]')
early_mean = Exponential('early_mean', beta=1.)
late_mean = Exponential('late_mean', beta=1.)
@deterministic(plot=False)
def rate(s=switchpoint, e=early_mean, l=late_mean):
    ''' Concatenate Poisson means '''
    out = empty(len(disasters_array))
    out[:s] = e
    out[s:] = l
    return out
disasters = Poisson('disasters', mu=rate, value=disasters_array, observed=True)
Now one can sample from the posterior distribution of the parameters with MCMC, using the Metropolis-Hastings algorithm:
from pymc.examples import disaster_model
from pymc import MCMC
M = MCMC(disaster_model)
M.sample(iter=10000, burn=1000, thin=10)
My problem is this: suppose I obtain new data after this sampling. How can I update my posterior distributions afterwards? In other words, how can I implement online learning with PyMC?
Upvotes: 4
Views: 1658
Reputation: 4203
You would need to specify a new model for the update, because you will now have informative priors for the unknown parameters. Specifically, the DiscreteUniform prior on the switchpoint would become either a Categorical or a Multinomial (with n=1), and the rate parameters might both be normally distributed. You could fit these priors (using one of several methods) to the posterior samples from the first run of the model. If you plan to update repeatedly, you could easily do this programmatically.
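As a rough sketch of the fitting step, the snippet below turns posterior draws into prior parameters using only NumPy. The helper names `fit_categorical` and `fit_normal`, the pseudocount, and the simulated draws are all assumptions made for illustration. With PyMC 2 the draws would instead come from the sampler's traces (something like `M.trace('switchpoint')[:]`). Moment matching is just one of the several possible fitting methods.

```python
import numpy as np


def fit_categorical(samples, n_categories, pseudocount=1.0):
    """Turn integer posterior draws into Categorical probabilities.

    The pseudocount (an assumption of this sketch) keeps every value
    possible, so later data can still move mass onto switchpoints the
    first run never visited.
    """
    counts = np.bincount(np.asarray(samples, dtype=int), minlength=n_categories)
    probs = counts + pseudocount
    return probs / probs.sum()


def fit_normal(samples):
    """Moment-match a Normal; returns (mu, tau) with tau = 1 / variance."""
    samples = np.asarray(samples, dtype=float)
    return samples.mean(), 1.0 / samples.var(ddof=1)


# Stand-in draws for illustration only; real draws would come from the
# traces of the first MCMC run.
rng = np.random.RandomState(0)
switch_draws = rng.randint(38, 44, size=900)
early_draws = rng.normal(3.0, 0.3, size=900)
late_draws = rng.normal(0.9, 0.1, size=900)

# DiscreteUniform(lower=0, upper=110) covers 111 integer values.
switch_p = fit_categorical(switch_draws, n_categories=111)
early_mu, early_tau = fit_normal(early_draws)
late_mu, late_tau = fit_normal(late_draws)

# In the updated model these would replace the original priors, e.g.:
#   switchpoint = Categorical('switchpoint', p=switch_p)
#   early_mean = Normal('early_mean', mu=early_mu, tau=early_tau)
#   late_mean = Normal('late_mean', mu=late_mu, tau=late_tau)
```

Note that fitting each parameter separately discards any posterior correlation between the switchpoint and the two rates, so the updated priors are only an approximation of the first run's joint posterior.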
Upvotes: 1