user2007598
user2007598

Reputation: 163

Implementing Gaussian mixture MLE using optim() in R

I am trying to implement MLE for Gaussian mixtures in R using optim() using R's local datasets (Geyser from MASS). My code is below. The issue is that optim works fine however, returns the original parameters that I pass to it and also says that it has converged. I would appreciate if you can point out where I am going off track. My expectation is that it would at least yield different results if not wildly different.

library(ggplot2)
library(MASS)
data("geyser")
externaldata=geyser$waiting
x.vector=externaldata


MLE.x= function(combined.vector)
{ combined.vector=bigvec
  x.vector = externaldata
  K = k #capital K inside this MLE function, small K defined in the global environment
  prob.vector = combined.vector[1:K] 
  mu.vector =combined.vector[(K+1):((2*K))]
  sd.vector=combined.vector[(2*K+1):(3*K)]
  prob=matrix(rep(prob.vector,length(x.vector)),byrow=TRUE,nrow = length(x.vector))
  mu.sigma=cbind(mu.vector,sd.vector)
  x.by.K=matrix(nrow = length(x.vector), ncol = k)
  for (i in 1:K){
    x.by.K[,i]=dnorm(x.vector,mu.sigma[i,1],mu.sigma[i,2])
  }
  prob.mat=x.by.K*prob
  density=apply(prob.mat,1,sum)
  log.density=sum(-log(density))
  return(log.density)
}



## k=2 set ##
meanvec=c(50,80)
sigmavec=c(5,5)
k=2
probvec=c(1/3,2/3)
bigvec=c(probvec,meanvec,sigmavec)
est.k2.MLE=MLE.x(bigvec)
z=optim(bigvec,
        fn=MLE.x,
        method = "L-BFGS-B")
z


#### k=3 set #####
meanvec=c(50,70,80)
sigmavec=c(5,5,5)
k=3
probvec=rep(1/3,3)
bigvec=c(probvec,meanvec,sigmavec)
est.k3.MLE=MLE.x(bigvec)
z=optim(bigvec,
        fn=MLE.x,
        method = "BFGS")
z

Upvotes: 0

Views: 472

Answers (1)

asachet
asachet

Reputation: 6921

Remove the first line of the MLE.x function.

It will always return the same thing since its argument is replaced by the global variable "bigvec". So the MLE cannot converge, I think you rather hit the maximum iteration. You can check this by accessing z$convergence where z is the value returned by optim. This will be an integer code. 0 means that all is well, 1 indicates that the maximum number of iteration has been reached. Other values are different error codes.

But the code still doesn't run properly as you point out in comment. I couldn't see any mistake so I added the following snippet at the end of MLE.x:

if(any(is.na(density))) {
    browser()
  } else {
    log.density
  }

What it does is, if there are some NA (or NaN), we call browser() which is an insanely convenient debugging tool: it stops the code and brings up the console so that we can explore the environment. Else we return log.density.

Then I ran the code and, behold, when there is a NA in density, instead of failing, it now brings up the console:

You can see:

Browse[1]> head(x.by.K)
     [,1]       [,2]
[1,]  NaN 0.01032407
[2,]  NaN 0.01152576
[3,]  NaN 0.01183521
[4,]  NaN 0.01032407
[5,]  NaN 0.01107446
[6,]  NaN 0.01079706

The first column of x.by.K is NaN... So dnorm returns NaN...

Browse[1]> mu.sigma
     mu.vector sd.vector
[1,]  64.70180 -20.13726
[2,]  61.89559  33.34679

Here is the problem: SD of -20, that can't be good...

Browse[1]> combined.vector
[1] 1267.90677 1663.42604   64.70180   61.89559  -20.13726   33.34679

But this is the input of MLE.x.

There, I've just showed you how I debug my code :)

So what's happening is that during the optimization routine, parameters 5 and 6 take negative values, which causes dnorm to fail. Why wouldn't they be negative? Optim doesn't know that these are supposed to stay positive!

So you must find a way to do a constrained optimization whith the constraint that SD>0.

But you actually shouldn't do that and rather think about what you want to do because I am not quite sure why you want to fit univariate Gaussian.

Upvotes: 2

Related Questions