flatronka
flatronka

Reputation: 1081

Custom Algorithm for Exp. maximization in Matlab

I try to write an algorithm which determine $\mu$, $\sigma$,$\pi$ for each class from a mixture multivariate normal distribution.

I finish with the algorithm partially, it works when I set the random guess values($\mu$, $\sigma$,$\pi$) near from the real value. But when I set the values far from the real one, the algorithm does not converge. The sigma goes to 0 $(2.30760684053766e-24 2.30760684053766e-24)$.

I think the problem is my covarience calculation, I am not sure that this is the right way. I found this on wikipediaenter image description here. I would be grateful if you could check my algorithm. Especially the covariance part.

Have a nice day, Thanks,

2 mixture gauss
size x  = [400, 2] (400 point 2 dimension gauss)
mu = 2 , 2 (1 row = first gauss mu, 2 row = second gauss mu)

    for i = 1 : k
        gaussEvaluation(i,:) = pInit(i) * mvnpdf(x,muInit(i,:), sigmaInit(i, :) * eye(d));
        gaussEvaluationSum = sum(gaussEvaluation(i, :));

        %mu calculation
        for j = 1 : d
            mu(i, j) = sum(gaussEvaluation(i, :) * x(:, j)) / gaussEvaluationSum;
        end
       %sigma calculation methode 1
       %for j = 1 : n 
        %    v = (x(j, :) - muNew(i, :));
        %    sigmaNew(i) = sigmaNew(i) + gaussEvaluation(i,j) * (v * v');
        %end
        %sigmaNew(i) = sigmaNew(i) / gaussEvaluationSum;

        %sigma calculation methode 2
        sub = bsxfun(@minus, x, mu(i,:));
        sigma(i,:) = sum(gaussEvaluation(i,:) * (sub .* sub)) / gaussEvaluationSum;

        %p calculation

        p(i) = gaussEvaluationSum / n;

Upvotes: 2

Views: 227

Answers (1)

Andrew Mao
Andrew Mao

Reputation: 36900

Two points: you can observe this even when you implement gaussian mixture EM correctly, but in your case, the code does seem to be incorrect.

First, this is just a problem that you have to deal with when fitting mixtures of gaussians. Sometimes one component of the mixture can collapse on to a single point, resulting in the mean of the component becoming that point and the variance becoming 0; this is known as a 'singularity'. Hence, the likelihood also goes to infinity.

Check out slide 42 of this deck: http://www.cs.ubbcluj.ro/~csatol/gep_tan/Bishop-CUED-2006.pdf

The likelihood function that you are evaluating is not log-concave, so the EM algorithm will not converge to the same parameters with different initial values. The link I gave above also gives some solutions to avoid this over-fitting problem, such as putting a prior or regularization term on your parameters. You can also consider running multiple times with different starting parameters and discarding any results with variance 0 components as having over-fitted, or just reduce the number of components you are using.

In your case, your equation is right; the covariance update calculation on Wikipedia is the same as the one on slide 45 of the above link. However, if you are in a 2d space, for each component the mean should be a length 2 vector and the covariance should be a 2x2 matrix. Hence your code (for two components) is wrong because you have a 2x2 matrix to store the means and a 2x2 matrix to store the covariances; it should be a 2x2x2 matrix.

Upvotes: 2

Related Questions