I am struggling with the following problem. In a nutshell: two different software packages (Gauss by Aptech and R) yield totally different Hessian matrices in a maximum likelihood procedure. I am using the same algorithm (BFGS), the exact same data, the same maximum likelihood formula (a very simple logit model) and the exact same starting values, and, confusingly, I get the same results for the parameters and the log-likelihood. Only the Hessian matrices differ across the two programs, and therefore the estimated standard errors and the statistical inference differ as well.
The deviation is not large in this specific example, but it grows as the model becomes more complex, so when I try to estimate my final model, the two programs yield completely different results.
Does anyone know how the two programs differ in the way they compute the Hessian, and possibly the right way to obtain the same results in both?
EDIT: In the R (Gauss) code, X (alt) is the independent variable: a two-column matrix whose first column is entirely ones and whose second column contains the subjects' responses. y (itn) is the dependent variable, a single column with the subjects' responses. The example (R code and data set) was taken from http://www.polsci.ucsb.edu/faculty/glasgow/ps206/ps206.html, simply as a way to reproduce and isolate the problem.
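For reference, written out from the code below, the objective in both programs is the standard binary-logit log-likelihood

logL(b) = y' * log(p) + (1 - y)' * log(1 - p),   where p = exp(X*b) / (1 + exp(X*b))

(R returns its negative, since optim minimizes; Gauss's maxlik maximizes it directly).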
I have attached both code listings (Gauss and R syntax) and their outputs.
Any help would be greatly appreciated. Thank you :)
Gauss:
start = { 0.95568840, -0.20459156 };
library maxlik,pgraph;
maxset;
_max_Algorithm = 2;     // 2 = BFGS
_max_Diagnostic = 1;
{betaa,f,g,cov,ret} = maxlik(XMAT,0,&ll,start);
call maxprt(betaa,f,g,cov,ret);
print _max_FinalHess;

// Log-likelihood; alt (X) and itn (y) are read as globals
proc ll(b,XMAT);
    local exb, probo, logexb, yn, logexbn, yt, ynt, logl;
    exb = EXP(alt*b);
    //print exb;
    probo = exb./(1+exb);       // fitted probabilities
    logexb = ln(probo);
    yn = 1 - itn;
    logexbn = ln(1 - probo);
    yt = itn';
    ynt = yn';
    logl = (yt*logexb + ynt*logexbn);   // maxlik maximizes this
    print(logl);
    retp(logl);
endp;
R:
startv <- c(0.95568840, -0.20459156)

## Negative log-likelihood of the logit model (optim minimizes by default)
logit.lf <- function(beta) {
    exb <- exp(X %*% beta)
    prob1 <- exb / (1 + exb)            # fitted probabilities
    logexb <- log(prob1)
    y0 <- 1 - y
    logexb0 <- log(1 - prob1)
    yt <- t(y)
    y0t <- t(y0)
    logl <- -(yt %*% logexb + y0t %*% logexb0)   # minus sign: optim minimizes
    return(logl)
}

logitmodel <- optim(startv, logit.lf, method = "BFGS",
                    control = list(trace = TRUE, REPORT = 1), hessian = TRUE)
logitmodel$hessian
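The standard errors, z statistics, and p values reported below are obtained from the inverse of this Hessian, along these lines:

covmat <- solve(logitmodel$hessian)   # covariance = inverse Hessian of the negative log-likelihood
se <- sqrt(diag(covmat))              # standard errors
z <- logitmodel$par / se              # z statistics
p <- 2 * pnorm(-abs(z))               # two-sided p values
cbind(logitmodel$par, se, z, p)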
Gauss Output:
return code = 0
normal convergence
Mean log-likelihood -0.591820
Number of cases 1924
Covariance matrix of the parameters computed by the following method:
Inverse of computed Hessian
Parameters Estimates Std. err. Est./s.e. Prob. Gradient
------------------------------------------------------------------
P01 2.1038 0.2857 7.363 0.0000 0.0000
P02 -0.9984 0.2365 -4.221 0.0000 0.0000
Gauss Hessian:
0.20133256 0.23932571
0.23932571 0.29377761
R Output:
initial value 1153.210839
iter 2 value 1148.015749
iter 3 value 1141.420328
iter 4 value 1138.668174
iter 5 value 1138.662148
iter 5 value 1138.662137
iter 5 value 1138.662137
final value 1138.662137
converged
Coeff. Std. Err. z p value
[1,] 2.10379869 0.28570765 7.3634665 1.7919000e-13
[2,] -0.99837955 0.23651060 -4.2212889 2.4290942e-05
R Hessian:
[,1] [,2]
[1,] 387.34106 460.45379
[2,] 460.45379 565.24412
Answer:
They are just scaled differently: the GAUSS numbers are around 1924 times smaller than the R numbers, and 1924 is exactly the number of cases reported in the GAUSS output. GAUSS's maxlik works with the mean log-likelihood (its output shows "Mean log-likelihood -0.591820", and -0.591820 × 1924 ≈ -1138.662, which matches R's final value), while the R code minimizes the total negative log-likelihood, so the R Hessian is N times the GAUSS one.
I think GAUSS keeps the numbers in this smaller range for numerical stability.
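You can check this in R with the objects from the question: dividing the Hessian by the number of cases essentially reproduces the GAUSS matrix, and refitting with a mean log-likelihood gives the same scale directly (a quick sketch, reusing logit.lf and startv from above):

N <- 1924                   # number of cases
logitmodel$hessian / N      # approximately the GAUSS Hessian

## equivalently, optimize the mean negative log-likelihood
logit.lf.mean <- function(beta) logit.lf(beta) / N
fit <- optim(startv, logit.lf.mean, method = "BFGS", hessian = TRUE)
fit$hessian                 # should be close to _max_FinalHess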