Reputation: 179
I'm new to JAGS and I'm trying to predict a binary outcome (0/1) using 9 non-continuous predictors. Predictor values may be 0, 1 or 2. This is my first time doing this, and even though I can get the model to run, I am 100% sure there's definitely a number of issues here.
$y
[1] 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
[29] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
$N
[1] 50
$oAnt
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1
[29] 1 1 1 1 1 1 2 1 1 0 1 1 1 1 1 2 1 1 1 1 1 1
$nAnt
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[29] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
$cAnt
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0
[29] 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0
$oPen
[1] 1 1 1 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 0 1 1 1 1 1 1 0
[29] 1 0 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 0 2 1 1
$nPen
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 2 1
[29] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1
$cPen
[1] 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1 0 0
[29] 0 0 0 1 1 0 0 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0
$oFin
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1
[29] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
$nFin
[1] 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1
[29] 1 1 1 1 2 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 3
$cFin
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1
[29] 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0
model {
for( i in 1 : N ){
y[i] ~ dbern(mu[i])
mu[i] <- 1/(1+exp(-(b0 + b1*oAnt[i] + b2*nAnt[i] + b3*cAnt[i] + b4*oPen[i] + b5*nPen[i] + b6*cPen[i] + b7*oFin[i] + b8*nFin[i] + b9*cFin[i])))
}
b0 ~ dnorm(0, 1.0e-12)
b1 ~ dnorm(0, 1.0e-12)
b2 ~ dnorm(0, 1.0e-12)
b3 ~ dnorm(0, 1.0e-12)
b4 ~ dnorm(0, 1.0e-12)
b5 ~ dnorm(0, 1.0e-12)
b6 ~ dnorm(0, 1.0e-12)
b7 ~ dnorm(0, 1.0e-12)
b8 ~ dnorm(0, 1.0e-12)
b9 ~ dnorm(0, 1.0e-12)
}
I used estimates from a glm()
model as inits (as suggest by A. Gelman)—but for the sake of simplicity, let's just assume I'll let JAGS pick the initial values for the chains.
jagsModel = jags.model(file = "antPenFin.txt", data = dataList, n.chains = 2, n.adapt = 500)
update(jagsModel, n.iter = 500)
codaSamples = coda.samples(jagsModel,
variable.names = names(dataList)[3:11], n.iter = 5000)
The output of my model looks totally off (which becomes clear when I try to plot it). I'm sure there's some very basic issue here. Could somebody help out?
Thanks a lot.
Upvotes: 1
Views: 3016
Reputation: 3055
Your issue here is that you cannot supply a range of numbers (from 0 to 3 in your case) as categorical covariates. Currently, your model is interpreting those numbers as continuous. What you need to do is convert these to dummy variables. You can do this easily using the model.matrix
function in R. I'm going to generate some data as an example, which you could then apply to your data.
# generate y
y <- sample(0:1, 30, replace = TRUE)
# add three different categorical covariates. Note here that these are all
# factors.
oAnt <- factor(sample(0:2, 30, replace = TRUE))
cAnt <- factor(sample(0:1, 30, replace = TRUE))
nFin <- factor(sample(0:3, 30, replace = TRUE))
# create your model matrix
my_matrix <- model.matrix(y ~ oAnt + cAnt + nFin)
head(my_matrix)
(Intercept) oAnt1 oAnt2 cAnt1 nFin1 nFin2 nFin3
1 1 1 0 1 0 1 0
2 1 1 0 1 1 0 0
3 1 1 0 0 0 0 1
4 1 1 0 1 0 0 1
5 1 0 0 1 0 1 0
6 1 1 0 1 0 0 1
For each level in a factor for a predictor, you are going to need to create n-1 columns (e.g. nFin ranges from 0-3, you will create 3 columns of dummy variables). Thus, you are going to have many more regresison coefficients to include in your model. Once you make your model matrix you can convert it to a list as such.
# remove the intercept from the list as you already have it in your model
matrix_list <- as.list(data.frame(my_matrix[,-1]))
From there, all you need to do is create some more predictors for model. Also, if nAnt
in your model is truly all ones, go ahead and remove that as well as you are basically just including two intercepts.
Upvotes: 0