Reputation: 13
New stan user here. This particular model (basically a mixed effects logistic regression) will run sometimes, but will often get errors "The following variables have undefined values: log_lik[182]" etc. It is always a problem with the "dev" or "log_lik" values. The index it gets caught on is sometimes at the transition between areas, but also in random places in some runs.
stan model:
data{
int nObs;
int S[nObs];
int<lower=0> n[nObs];
real Area2[nObs];
real Area3[nObs];
real Julian_Day[nObs];
int Year[nObs];
int nYears;
}
parameters{
real intercept_raw;
real beta_Area2_raw;
real beta_Area3_raw;
real gamm_raw;
real gamm_raw_Area2;
real gamm_raw_Area3;
real vary_Year[nYears];
real<lower=0> sigma_Year;
}
transformed parameters {
real intercept;
real beta_Area2;
real beta_Area3;
real gamm;
real gamm_Area2;
real gamm_Area3;
intercept <- intercept_raw*20;
beta_Area2 <- beta_Area2_raw*5;
beta_Area3 <- beta_Area3_raw*5;
gamm <- gamm_raw*5;
gamm_Area2 <- gamm_raw_Area2*5;
gamm_Area3 <- gamm_raw_Area3*5;
}
model{
real vary[nObs];
real y[nObs];
// Priors
intercept_raw ~ normal(0,1);
beta_Area2_raw ~ normal( 0 , 1 );
beta_Area3_raw ~ normal( 0 , 1 );
gamm_raw ~ normal( 0 , 1 );
gamm_raw_Area2 ~ normal( 0 , 1 );
gamm_raw_Area3 ~ normal( 0 , 1 );
sigma_Year ~ cauchy( 0 , 5 );
// random effect
for ( j in 1:nYears ) vary_Year[j] ~ normal( 0 , sigma_Year );
// Fixed effects
for ( i in 1:nObs ) {
vary[i] <- vary_Year[Year[i]];
y[i] <- vary[i] + intercept
+ beta_Area2 * Area2[i]
+ beta_Area3 * Area3[i]
+ gamm * Julian_Day[i]
+ gamm_Area2 * Area2[i] * Julian_Day[i]
+ gamm_Area3 * Area3[i] * Julian_Day[i];
}
S ~ binomial_logit( n, y );
}
generated quantities{
real y_pred[nObs];
real dev;
real y[nObs];
real vary[nObs];
vector[nObs] log_lik;
dev <- 0;
for ( i in 1:nObs ) {
vary[i] <- vary_Year[Year[i]];
y[i] <- vary[i] + intercept
+ beta_Area2 * Area2[i]
+ beta_Area3 * Area3[i]
+ gamm * Julian_Day[i]
+ gamm_Area2 * Area2[i] * Julian_Day[i]
+ gamm_Area3 * Area3[i] * Julian_Day[i];
log_lik[i] <- binomial_log( S[i] , n[i] , inv_logit(y[i]));
dev <- dev + (-2) * log_lik[i];
y_pred[i] <- binomial_rng(100, inv_logit(y[i]) );
}
}
The data looks like this (dataframe "SDF"):
Year Area.ID DayIndex S n Area1 Area2 Area3
1 1 1 19 1 1 1 0 0
2 1 1 22 0 2 1 0 0
3 1 1 23 2 5 1 0 0
4 1 1 24 1 3 1 0 0
5 1 1 26 3 3 1 0 0
6 1 1 28 1 3 1 0 0
and these calls are used in R:
Dlist <- list ("nObs"=dim(SDF)[1], "S"=SDF$S, "n"=SDF$n,
"Area2"= SDF$Area2,"Area3"= SDF$Area3, "Julian_Day"=SDF$DayIndex,
"Year"=SDF$Year,"nYears"=length(unique(SDF$Year)))
# Fit intercept model using stan
fit_ints <- stan(file='STAN/Logistic_Diff_Slope_SN.stan',data = Dlist, iter=5000, chains=3)
Upvotes: 1
Views: 1072
Reputation: 4990
This error message occurs when some generated quantity evaluates to NaN
, usually due to numerical underflow or overflow.
In your case, you can probably avoid it by using the more numerically stable binomial_logit_log
function rather than the binomial_log
function (for the same reasons you are using binomial_logit
in the model
block rather than binomial
). In other words, it should be
log_lik[i] <- binomial_logit_log( S[i] , n[i] , y[i]);
rather than
log_lik[i] <- binomial_log( S[i] , n[i] , inv_logit(y[i]));
In addition, when drawing from the posterior predictive distribution, you could do something like
p <- inv_logit(y[i]);
if (is_nan(p)) y_pred[i] <- y[i] > 0;
else if (p >= 1) y_pred[i] <- 1;
else if (p <= 0) y_pred[i] <- 0;
else y_pred[i] <- binomial_rng(100, inv_logit(y[i]));
Unfortunately, there is no binomial_logit_rng
function in Stan at the moment.
Upvotes: 4