pythonhidden-markov-modelsmarkov-chainspymc

Reputation: 1246

PyMC: Hierachical Hidden Markov Model

This is a follow up on PyMC: Parameter estimation in a Markov system

I have a system which is defined by its position and velocity at each timestep. The behavior of the system is defined as:

vel = vel + damping * dt
pos =  pos + vel * dt

So, here is my PyMC model. To estimate vel, pos and most importantly damping.

# PRIORS
damping = pm.Normal("damping", mu=-4, tau=(1 / .5**2))
# we assume some system noise
tau_system_noise = (1 / 0.1**2)

# the state consist of (pos, vel); save in lists
# vel: we can't judge the initial velocity --> assume it's 0 with big std
vel_states = [pm.Normal("v0", mu=-4, tau=(1 / 2**2))]
# pos: the first pos is just the observation
pos_states = [pm.Normal("p0", mu=observations[0], tau=tau_system_noise)]

for i in range(1, len(observations)):
    new_vel = pm.Normal("v" + str(i),
                        mu=vel_states[-1] + damping * dt,
                        tau=tau_system_noise)
    vel_states.append(new_vel)
    pos_states.append(
        pm.Normal("s" + str(i),
                  mu=pos_states[-1] + new_vel * dt,
                  tau=tau_system_noise)
    )

# we assume some observation noise
tau_observation_noise = (1 / 0.5**2)
obs = pm.Normal("obs", mu=pos_states, tau=tau_observation_noise, value=observations, observed=True)

This is how I run the sampling:

mcmc = pm.MCMC([damping, obs, vel_states, pos_states])
mcmc.sample(50000, 25000)
pm.Matplot.plot(mcmc.get_node("damping"))
damping_samples = mcmc.trace("damping")[:]
print "damping -- mean:%f; std:%f" % (mean(damping_samples), std(damping_samples))
print "real damping -- %f" % true_damping

The value for damping is dominated by the prior. Even if I change the prior to Uniform or whatever, it is still the case.

What am I doing wrong? It's pretty much like the previous example, just with another layer.

The full IPython notebook of this problem is available here: http://nbviewer.ipython.org/github/sotte/random_stuff/blob/master/PyMC%20-%20HMM%20Dynamic%20System.ipynb

[EDIT: Some clarifications & code for sampling.]

[EDIT2: @Chris answer didn't help. I could not use AdaptiveMetropolis since the *_states don't seem to be part of the model.]

Upvotes: 1

Answers (4)

Aravind

Reputation: 136

I think that your initial approach is fine and should work, except that the "obs" variable has not been included in the list of nodes supplied to MCMC (see In[10] in your notebook). After including this variable, the MCMC solver runs fine and does enforce the conditional dependencies specified by your model. I'd like to repeat the point made by Chris that it is best to define the model in a different file or under a function to avoid such mistakes.

The reason why you don't get the right results, is that your priors have been chosen arbitrarily and in some cases, the values are such that it is very difficult for the model to mix properly in order to converge. Your toy problem tries to estimate a damping value such that the positions converge to vector of observed positions. For this, your model should have the flexibility to choose velocity and damping values in a wide range so that stochastic errors in the position/velocity can be corrected when going from one time step to the next. Otherwise, as a result of your Euler integration scheme, the errors just keep getting propagated. I think Chris referred to the same thing when he suggested choosing a more diffuse prior.

I suggest playing around with the tau values for each of the Normal variables. For instance, I changed the following values:

damping = pm.Normal("damping", mu=0, tau=1/20.**2) # was tau=1/2.**2

new_vel = pm.Normal("v" + str(i),
                    mu=vel_states[-1] + damping * dt,
                    tau=(1/2.**2)) # was tau=tau_system_noise=(1 / 0.5**2)

tau_observation_noise = (1 / 0.005**2) # was 1 / 0.5**2

You can see the modified file here.

The plots at the bottom show that the positions are indeed converging. The velocities are all over the place. The estimated mean value of damping is 6.9, which is very different from -1.5. Perhaps you can achieve better estimates by choosing appropriate values for the priors.

Upvotes: 0

Roger Stuckey

Reputation: 31

Assuming you know the structure of your model -- you are doing parameter estimation, not system identification -- you can construct your PyMC model as a regression, with unknown damping, initial position and initial velocity as parameters and the array of positions, your observations.

That is, with class PM representing the point-mass system:

pm = PM(true_damping)

positions, velocities = pm.integrate(true_pos, true_vel, N, dt)

# Assume little system noise
std_system_noise = 0.05
tau_system_noise = 1.0/std_system_noise**2

# Treat the real positions as observations
observations = positions + np.random.randn(N,)*std_system_noise

# Damping is modelled with a Uniform prior
damping = mc.Uniform("damping", lower=-4.0, upper=4.0, value=-0.5)

# Initial position & velocity unknown -> assume Uniform priors
init_pos = mc.Uniform("init_pos", lower=-1.0, upper=1.0, value=0.5)
init_vel = mc.Uniform("init_vel", lower=0.0, upper=2.0, value=1.5)

@mc.deterministic
def det_pos(d=damping, pi=init_pos, vi=init_vel):
    # Apply damping, init_pos and init_vel estimates and integrate
    pm.damping = d.item()
    pos, vel = pm.integrate(pi, vi, N, dt)
    return pos

# Standard deviation is modelled with a Uniform prior
std_pos = mc.Uniform("std", lower=0.0, upper=1.0, value=0.5)

@mc.deterministic
def det_prec_pos(s=std_pos):
    # Precision, based on standard deviation
    return 1.0/s**2

# The observations are based on the estimated positions and precision
obs_pos = mc.Normal("obs", mu=det_pos, tau=det_prec_pos, value=observations, observed=True)

# Create the model and sample
model = mc.Model([damping, init_pos, init_vel, det_prec_pos, obs_pos])
mcmc = mc.MCMC(model)
mcmc.sample(50000, 25000)

The full listing is here: https://gist.github.com/stuckeyr/7762371

Increasing N and decreasing dt will improve your estimates markedly.

Upvotes: 1

Chris Fonnesbeck

Reputation: 4203

There are a couple of issues with the model, looking at it again. First and foremost, you did not add all of your PyMC objects to the model. You have only added [damping, obs]. You should pass all of the PyMC nodes to the model.

Also, note that you don't need to call both Model and MCMC. This is fine:

model = pm.MCMC([damping, obs, vel_states, pos_states])

The best workflow for PyMC is to keep your model in a separate file from the running logic. That way, you can just import the model and pass it to MCMC:

import my_model

model = pm.MCMC(my_model)

Alternately, you can write your model as a function, returning locals (or vars), then calling the function as the argument for MCMC. For example:

def generate_model():

    # put your model definition here

    return locals()

model = pm.MCMC(generate_model())

Upvotes: 1

Chris Fonnesbeck

Reputation: 4203

What do you mean by unreasonable? Are they shrunken toward the prior? Damping seems to have a pretty tight variance -- what if you give it a more diffuse prior?

Also, you might try using the AdaptiveMetropolis sampler on the *_states arrays:

my_model.use_step_method(AdaptiveMetropolis, my_model.vel_states)

It sometimes mixes better for correlated variables, as these likely are.

Upvotes: 0

PyMC: Hierachical Hidden Markov Model

Answers (4)

Related Questions