I'm new to JAGS, and I'm trying to run a simple logistic regression. My data file is very simple: the response is binary and the one predictor I'm using has three levels. Like this:
col1: 1 2 2 2 1 1 1 2 1 2 ...
col2: HLL, HLL, LHL, LLL, LHL, HLL ...
The levels in col2 are HLL, LHL, LLL. I dummy coded it and created a data frame that looks like this:
(intercept) HLL LHL LLL
1 1 0 0 1
2 1 0 0 1
4 1 0 0 1
5 1 0 1 0
6 1 0 1 0
7 1 0 0 1
My data file (myList), then, looks like this:
List of 5
$ y : num [1:107881] 2 2 2 2 2 2 2 2 2 2 ...
$ N : num 500
$ HLL: num [1:107881] 0 0 0 0 0 0 0 0 0 0 ...
$ LHL: num [1:107881] 0 0 0 1 1 0 0 0 0 1 ...
$ LLL: num [1:107881] 1 1 1 0 0 1 1 1 1 0 ...
I'm using N=500 because the full data frame is huge and I just want to test it.
cat(
"model {
  for( i in 1 : N ){
    y[i] ~ dbern(mu[i])
    mu[i] <- 1/(1+exp(-(a + b*HLL[i] + c*LHL[i] + d*LLL[i])))
  }
  a ~ dnorm(0, 1.0e-12)
  b ~ dnorm(0, 1.0e-12)
  c ~ dnorm(0, 1.0e-12)
  d ~ dnorm(0, 1.0e-12)
}", file = "model.txt"
)
model = jags.model(file = "model.txt",
                   data = myList,
                   n.chains = 3, n.adapt = 500)
Error in jags.model(file = "model.txt", data = antPenList, n.chains = 3, :
Error in node y[1]
Node inconsistent with parents
The dbern distribution expects a response in {0,1} rather than {1,2}, which is how you seem to have coded it, so you need to subtract 1 from your values of y.
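A minimal sketch of the recoding in R. The small `myList` below is a made-up stand-in so the snippet runs on its own; with the real data only the subtraction line is needed.

```r
# Hypothetical stand-in for the question's myList; only y matters here.
myList <- list(y = c(1, 2, 2, 2, 1, 1, 1, 2, 1, 2), N = 10)

# dbern needs 0/1 outcomes, so shift the 1/2 coding down by one.
myList$y <- myList$y - 1

# Sanity check: every value should now be 0 or 1.
stopifnot(all(myList$y %in% c(0, 1)))
table(myList$y)
```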
It is a bit strange that you get this error, as dbern does not usually give an error for other response values (it basically treats values below 0 as 0 and values above 1 as 1). The error probably stems from the fact that, under that treatment, every response ends up with the same value. If recoding y doesn't fix it, you could try the following:
1) Try increasing the precision of your priors for a/b/c/d slightly. A precision of 1.0e-12 corresponds to a variance of 10^12, which is quite a lot.
2) Instead of:
mu[i] <- 1/(1+exp(-(a + b*HLL[i] + c*LHL[i] + d*LLL[i])))
You could write:
logit(mu[i]) <- a + b*HLL[i] + c*LHL[i] + d*LLL[i]
This might also help JAGS to recognise this as a GLM and initiate the appropriate samplers. Remember to load the glm module (load.module("glm") in rjags).
3) Set some initial values for a/b/c/d that are vaguely consistent with your data (perhaps obtained using a fit with glm() in R)
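The three suggestions can be combined roughly as follows. This is a sketch under assumptions, not a tested fix for the original data: the data are simulated, the prior precision of 1.0e-4 is just one moderately vague choice, and the names `sim`, `fit`, `inits` and `model_string` are made up for illustration. Because the three dummy columns sum to the intercept column, glm() reports NA for one of them; the sketch starts that coefficient at 0.

```r
set.seed(1)

# Simulated stand-in for the question's data (hypothetical values).
n   <- 500
lev <- sample(c("HLL", "LHL", "LLL"), n, replace = TRUE)
sim <- data.frame(
  HLL = as.numeric(lev == "HLL"),
  LHL = as.numeric(lev == "LHL"),
  LLL = as.numeric(lev == "LLL")
)
sim$y <- rbinom(n, 1, plogis(-0.5 + 1.0 * sim$HLL + 0.3 * sim$LHL))

# Suggestion 3: rough starting values from an ordinary glm() fit.
fit <- glm(y ~ HLL + LHL + LLL, family = binomial, data = sim)
est <- unname(coef(fit))
est[is.na(est)] <- 0  # the aliased dummy comes back as NA; start it at 0
inits <- list(a = est[1], b = est[2], c = est[3], d = est[4])

# Suggestions 1 and 2: less extreme priors and the logit() link.
model_string <- "model {
  for (i in 1:N) {
    y[i] ~ dbern(mu[i])
    logit(mu[i]) <- a + b*HLL[i] + c*LHL[i] + d*LLL[i]
  }
  a ~ dnorm(0, 1.0e-4)
  b ~ dnorm(0, 1.0e-4)
  c ~ dnorm(0, 1.0e-4)
  d ~ dnorm(0, 1.0e-4)
}"

# Only attempt the JAGS part if rjags (and JAGS itself) is available.
if (requireNamespace("rjags", quietly = TRUE)) {
  library(rjags)
  load.module("glm")  # lets JAGS use its GLM-specific samplers
  model <- jags.model(
    textConnection(model_string),
    data = list(y = sim$y, N = n, HLL = sim$HLL, LHL = sim$LHL, LLL = sim$LLL),
    inits = inits, n.chains = 3, n.adapt = 500
  )
}
```

Passing a single list as `inits` gives all three chains the same starting point; a list of three lists, or a function returning a list, would give each chain its own.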