stan error "undefined values" while fitting logistic model

Question

New stan user here. This particular model (basically a mixed effects logistic regression) will run sometimes, but will often get errors "The following variables have undefined values: log_lik[182]" etc. It is always a problem with the "dev" or "log_lik" values. The index it gets caught on is sometimes at the transition between areas, but also in random places in some runs.

stan model:

data{
    int nObs;
    int S[nObs];
    int n[nObs];
    real Area2[nObs];
    real Area3[nObs];
    real Julian_Day[nObs];
    int Year[nObs];
    int nYears;
}

parameters{
    real intercept_raw;
    real beta_Area2_raw;
    real beta_Area3_raw;
    real gamm_raw;
    real gamm_raw_Area2;
    real gamm_raw_Area3;
    real vary_Year[nYears];
    real sigma_Year;
}

transformed parameters {
    real intercept;
    real beta_Area2;
    real beta_Area3;
    real gamm;
    real gamm_Area2;
    real gamm_Area3;
    intercept <- intercept_raw*20;
    beta_Area2 <- beta_Area2_raw*5;
    beta_Area3 <- beta_Area3_raw*5;
    gamm <- gamm_raw*5;
    gamm_Area2 <- gamm_raw_Area2*5;
    gamm_Area3 <- gamm_raw_Area3*5;
}

model{
    real vary[nObs];
    real y[nObs];
    // Priors
    intercept_raw ~ normal(0,1);
    beta_Area2_raw ~ normal( 0 , 1 );
    beta_Area3_raw ~ normal( 0 , 1 );
    gamm_raw ~ normal( 0 , 1 );
    gamm_raw_Area2 ~ normal( 0 , 1 );
    gamm_raw_Area3 ~ normal( 0 , 1 );
    sigma_Year ~ cauchy( 0 , 5 );
    // random effect
    for ( j in 1:nYears ) vary_Year[j] ~ normal( 0 , sigma_Year );
    // Fixed effects
    for ( i in 1:nObs ) {
        vary[i] <- vary_Year[Year[i]];
        y[i] <- vary[i] + intercept
                + beta_Area2 * Area2[i]
                + beta_Area3 * Area3[i]
                + gamm * Julian_Day[i]
                + gamm_Area2 * Area2[i] * Julian_Day[i]
                + gamm_Area3 * Area3[i] * Julian_Day[i];
    }
     S ~ binomial_logit( n, y );
}

generated quantities{
  real y_pred[nObs];
  real dev;
  real y[nObs];
  real vary[nObs];
  vector[nObs] log_lik;
  dev <- 0;
    for ( i in 1:nObs ) {
       vary[i] <- vary_Year[Year[i]];
       y[i] <- vary[i] + intercept
                + beta_Area2 * Area2[i]
                + beta_Area3 * Area3[i]
                + gamm * Julian_Day[i]
                + gamm_Area2 * Area2[i] * Julian_Day[i]
                + gamm_Area3 * Area3[i] * Julian_Day[i];
        log_lik[i] <- binomial_log( S[i] , n[i] , inv_logit(y[i]));       
        dev <- dev + (-2) * log_lik[i];
        y_pred[i] <- binomial_rng(100, inv_logit(y[i]) );
    }
}

The data looks like this (dataframe "SDF"):

 Year Area.ID DayIndex S n Area1 Area2 Area3
1    1       1       19 1 1     1     0     0
2    1       1       22 0 2     1     0     0
3    1       1       23 2 5     1     0     0
4    1       1       24 1 3     1     0     0
5    1       1       26 3 3     1     0     0
6    1       1       28 1 3     1     0     0

and these calls are used in R:

Dlist <- list ("nObs"=dim(SDF)[1], "S"=SDF$S,  "n"=SDF$n,   
  "Area2"= SDF$Area2,"Area3"= SDF$Area3,  "Julian_Day"=SDF$DayIndex,    
   "Year"=SDF$Year,"nYears"=length(unique(SDF$Year)))

# Fit intercept model using stan
fit_ints <- stan(file='STAN/Logistic_Diff_Slope_SN.stan',data = Dlist, iter=5000, chains=3)

Ben Goodrich · Accepted Answer

This error message occurs when some generated quantity evaluates to NaN, usually due to numerical underflow or overflow.

In your case, you can probably avoid it by using the more numerically stable binomial_logit_log function rather than the binomial_log function (for the same reasons you are using binomial_logit in the model block rather than binomial). In other words, it should be log_lik[i] <- binomial_logit_log( S[i] , n[i] , y[i]); rather than log_lik[i] <- binomial_log( S[i] , n[i] , inv_logit(y[i])); In addition, when drawing from the posterior predictive distribution, you could do something like p <- inv_logit(y[i]); if (is_nan(p)) y_pred[i] <- y[i] > 0; else if (p >= 1) y_pred[i] <- 1; else if (p <= 0) y_pred[i] <- 0; else y_pred[i] <- binomial_rng(100, inv_logit(y[i])); Unfortunately, there is no binomial_logit_rng function in Stan at the moment.

stan error "undefined values" while fitting logistic model

Answers (1)

Related Questions

stan error &quot;undefined values&quot; while fitting logistic model

Answers (1)

Related Questions

stan error "undefined values" while fitting logistic model