Reputation: 43
When I ran the following SAS code, it said the optimization could not be completed. One problem, I can guess, is that I have used wide-format data, which is not good for PROC NLMIXED. Can anyone help me rewrite the following code to use long-format data? I have to use a DO loop, but I am not sure how to do that with long-format data.
proc nlmixed data=new method=GAUSS NOAD qpoints=50 tech=newrap maxit=5000;
array y{*} y1-y5;
array Time{*} Time1-Time5;
parms b1=0.95 sdn1=0.95 sdC=0.90 fi=0.45;
bounds sdn1 >= 0, sdC >= 0;
pi=constant('pi');
L1 = log(pi) -
     ((pi*(y[1]-b1*Time[1]+bD))/(sdn1*sqrt(3))) -
     log(sdn1) - 0.5*log(3) -
     2*log(1+exp(-((pi*(y[1]-b1*Time[1]+bD))/(sdn1*sqrt(3)))));
L2=1;
do j=2 to 5;
L2 = L2*(log(pi) -
     ((pi*(y[j]-b1*Time[j]+y[j-1]*fi+bD))/(sdn1*sqrt(3*(1-fi*fi)))) -
     log(sdn1) - 0.5*log(3*(1-fi*fi)) -
     2*log(1+exp(-((pi*(y[j]-b1*Time[j]+y[j-1]*fi+bD))/(sdn1*sqrt(3*(1-fi*fi)))))));
end;
L=L1*L2;
dum=1;
model dum ~ general(L);
random bD ~ normal(0,sdC*sdC) subject=ID;
run;
Upvotes: 0
Views: 267
Reputation: 46
I have a few points. First, when a complex likelihood function seems to be giving you problems, you can often locate the source of the difficulty by constructing the likelihood function in a DATA step. You might be exponentiating a value that overflows, or you might be attempting to compute the logarithm or square root of a negative number. Error reporting about such issues is often much more informative when you construct the (log) likelihood in DATA step code. Note that the purpose of the DATA step code is not to maximize the likelihood, but to evaluate it at some specified set of parameter values.
Some modifications must be made in order to convert NLMIXED code to DATA step code. First, the PROC NLMIXED statement (with its options) must be converted to a DATA statement; a DATA _NULL_; statement is typically sufficient. Second, the PARMS statement is not supported in a DATA step. It can be converted to a RETAIN statement, often with some modification of the code. (The PARMS statement allows an equal sign between parameter and value; the RETAIN statement does not. In addition, the PARMS statement lets you specify multiple initial values for a single parameter, which the RETAIN statement does not support.) Other statements that often occur in NLMIXED code are also unavailable in the DATA step: in your code, the BOUNDS, MODEL, and RANDOM statements need to be commented out. Since the random effects identified on a RANDOM statement then disappear from the model, some value must be assigned to each random effect.
Now, one of the statements mentioned above as unsupported in DATA step code can (should!) usually be done away with in NLMIXED code as well. Specifically, the BOUNDS statement can cause all kinds of problems during NLMIXED execution, and the model can almost always be parameterized in such a way that a BOUNDS statement is not needed. Since you have two parameters (sdn1 and sdC) that you want to be non-negative, you can write the model in terms of the logarithms of those two parameters (log_sdn1 and log_sdC). You can then construct the variables sdn1=exp(log_sdn1) and sdC=exp(log_sdC) immediately after the PARMS (or RETAIN) statement and reference sdn1 and sdC in your likelihood specification. Note one small difference: your BOUNDS statement allows sdn1 and sdC to equal 0 exactly, whereas exponentiating log_sdn1 and log_sdC can only get infinitely close to 0.
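For reference, here is a sketch of how the reparameterized PROC NLMIXED code might begin, keeping your original options and starting values. The names log_sdn1 and log_sdC are just illustrative, and the likelihood construction itself is unchanged from your original code:

```
proc nlmixed data=new method=GAUSS NOAD qpoints=50 tech=newrap maxit=5000;
   array y{*} y1-y5;
   array Time{*} Time1-Time5;
   /* Optimize over the unconstrained log-scale parameters;
      starting values are the logs of your original starting values */
   parms b1=0.95 log_sdn1=%sysfunc(log(0.95)) log_sdC=%sysfunc(log(0.90)) fi=0.45;
   /* Recover the positive-scale parameters; no BOUNDS statement needed */
   sdn1 = exp(log_sdn1);
   sdC  = exp(log_sdC);
   /* ... construct L exactly as in your original code ... */
   model dum ~ general(L);
   random bD ~ normal(0, sdC*sdC) subject=ID;
run;
```

Because log_sdn1 and log_sdC are unconstrained, Newton-Raphson never has to fight an active boundary constraint during the optimization.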
I constructed a data set consisting of the first two records that you display above and then made modifications to the NLMIXED code to convert it to DATA step code as indicated above. See below:
data new;
input id y1-y5;
cards;
1 0.3393279811 2.9297994781 3.3204747713 5.712911709 7.303402636
2 0.6299412759 0.5226832086 2.6025372538 4.6200632561 6.3933354199
;;;
/*proc nlmixed
data=new method=GAUSS NOAD qpoints=50 tech=newrap maxit=5000;*/
data _null_;
set new;
array y{*} y1-y5;
array Time{*} Time1-Time5;
do i=1 to 5;
Time{i} = i;
end;
/* Change PARMS statement to RETAIN statement and express sdn1 and sdC differently */
/*parms b1=0.95 sdn1=0.95 sdC=0.90 fi=0.45;*/
retain b1 0.95
log_sdn1 %sysfunc(log(0.95))
log_sdC %sysfunc(log(0.90))
fi 0.45;
/*bounds sdn1 >= 0, sdC >= 0;*/
sdn1 = exp(log_sdn1);
sdC = exp(log_sdC);
pi=constant('pi');
/* Add bD=0 to data step code only so that there is some value specified for the random effect. */
bD=0;
L1 = log(pi) -
((pi*(y[1]-b1*Time[1] + bD))/(sdn1*sqrt(3))) -
log(sdn1) - 0.5*log(3) -
2*log(1 + exp(-((pi *(y[1]-b1*Time[1] + bD))/(sdn1*sqrt(3)))));
L2=1;
do j=2 to 5;
L2=L2*(log(pi) - ((pi*(y[j]-b1*Time[j]+y[j-1]*fi+ bD))/(sdn1*sqrt(3*(1-fi*fi)))) -
log(sdn1) - 0.5*log(3*(1-fi*fi)) -
2*log(1+exp(-((pi*(y[j]-b1*Time[j]+y[j-1]*fi+ bD))/(sdn1*sqrt(3*(1-fi*fi))))))
);
end;
L=L1*L2;
dum=1;
/*model dum ~ general(L);*/
/*random bD ~ normal(0,sdC*sdC) subject=ID;*/
run;
I would offer a couple of final notes. I have frequently used wide data sets to construct likelihood functions when there are a small number of observations per subject. From the limited look I took at how you have constructed your likelihood function, I don't think reshaping the data into long format would benefit you.
The above code runs fine, but I employed just two records. You might identify an issue with the likelihood function for some of the records that I excluded.
It is possible, too, that your optimization problems are related to the use of a BOUNDS statement. The reparameterized model may be all that is required for your likelihood function to be maximized successfully.
Upvotes: 1