Mihai Soldan

Reputation: 11

Figaro probabilistic programming parameter learning

I've recently started studying probabilistic graphical models. I've read the book "Practical Probabilistic Programming" by Avi Pfeffer about the Figaro probabilistic programming language. As an exercise I'm trying to learn the parameters of a normal distribution from a training set, but the results I obtain aren't what I would reasonably expect.
I've defined a model where the normal distribution depends on two parameters: the mean is itself a normal distribution with a mean of 50 and a variance of 0.01, and the variance is a gamma distribution with k=2 and theta=2.
I make 100 observations, each with a value of 100, and infer the mean and the variance with an importance sampling algorithm. Here is the code:

val mean: Element[Double] = Normal(50, 0.01)
val variance: Element[Double] = Gamma(2, 2)
val tripDistances = for (i <- Range(0, 100)) yield
  Chain(mean, variance, (m: Double, v: Double) => Normal(m, v))
for (t <- tripDistances) { t.observe(100) }
val importance = Importance(10000, mean, variance)
importance.start()
val expectedMeanVal = importance.computeExpectation(mean, (m: Double) => m)
val expectedVarianceVal = importance.computeExpectation(variance, (v: Double) => v)
importance.kill()
println("the mean = " + expectedMeanVal)
println("the variance = " + expectedVarianceVal)

Here is the output:

the mean = 49.905560193556994
the variance = 23.82362490526008

It is as if the observations had no effect whatsoever on the probability distributions of the parameters. This is rather odd (surely I'm missing something), since I'm chaining the two elements (mean and variance) to create the normal distribution for which I then observe the actual values. I hope somebody can help me out. Thanks.

Upvotes: 1

Views: 154

Answers (1)

Kevin Li

Reputation: 231

When you defined your prior on the Normal mean, you gave it a variance of 0.01, i.e. a standard deviation of 0.1. Intuitively, this says you are very sure the mean is very close to 50: virtually all of the prior mass lies within 50 ± 0.3, so even a hundred observations of 100 pull the posterior mean only slightly away from 50.

Aside from this, you might be interested to know that the inverse-gamma distribution is a conjugate prior for the variance of a normal random variable when the mean is known. Similarly, the normal-gamma bivariate distribution is a conjugate prior when both the mean and the precision (the inverse of the variance) are unknown, and the normal-inverse-gamma when both the mean and the variance are unknown.

The Wikipedia pages on these distributions will serve you well, and you can verify that the posterior mean (with the hyperparameters you have defined here), given that you observe 100 a hundred times, is not so far from 50.
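To make that concrete, here is a quick sanity check (plain Python, outside Figaro) using the standard conjugate update for a normal mean when the variance is treated as known. The value 24.0 for the "known" variance is an assumption on my part, chosen to roughly match the posterior variance your sampler reported:

```python
def posterior_mean(mu0, tau2, sigma2, n, xbar):
    """Posterior mean of a Normal mean under a Normal(mu0, tau2) prior,
    given n observations with sample mean xbar and known variance sigma2."""
    prior_precision = 1.0 / tau2   # weight of the prior
    data_precision = n / sigma2    # combined weight of the observations
    return (prior_precision * mu0 + data_precision * xbar) / (
        prior_precision + data_precision
    )

# Prior N(50, 0.01); 100 observations of 100; variance assumed known at ~24.
print(posterior_mean(50.0, 0.01, 24.0, 100, 100.0))  # roughly 52
```

The tight prior contributes a precision of 1/0.01 = 100, while the hundred observations (at variance ~24) contribute only about 4.2, which is why the posterior mean barely moves from 50 — consistent with the ~49.9 that importance sampling gave you.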

Upvotes: 2
