grifjo06
grifjo06

Reputation: 51

Defining the Prior in MCMCglmm

I've found this same question posted everywhere, and I can't seem to get any of the solutions to fit to my data, and I'm wondering if I'm trying to fit my data to a model that's just too complex.

I'm trying to fit my data to a multinomial logistic regression model from the MCMCglmm package. I've looked at many different documents, tutorials, and the MCMCglmm manual itself, primarily Florian Jaeger's tutorial which is excellent and thorough. I get lost, however, in his selection of the values for both the G-structure and R-structure of the prior, and I keep getting this error message

Error in priorformat(if (NOpriorG) { : 
  V is the wrong dimension for some prior$G/prior$R elements

In particular, I'm not sure what the n values should be given my data in both, but the somewhat opaque error message is suggesting that it's an issue with V

Here's a (relevant) subset of my data:

CG_imm       locuteur   enquete       loc_age   loc_sexe      left    liquid   right    articulation_C1    voice_C1   NC_C1    NC_right    voice_right    right2          pos    logfreq      realization
abordable    44ajs1     Nantes        79         M            bl      l        p        stop               V          Vstop    stop        NV             NVstop          adj    NA           2
admettre     91adb1     Brunoy        54         M            tR      R        E        stop               N          NVstop   mid-vowel   V              mid-vowel       verb   6.52209279   0
adorable     91aal1     Brunoy        27         F            bl      l        break    stop               V          Vstop    break       break          weak break      adj    NA           0
agréable     92aaf2     PC            55         F            bl      l        break    stop               V          Vstop    break       break          strong break    adj    7.95191138   0
agréable     21abm1     Dijon         31         M            bl      l        k        stop               V          Vstop    stop        NV             NVstop          adj    7.95191138   1
agréable     75ccr2     Paris         NA         F            bl      l        break    stop               V          Vstop    break       break          break           adj    7.95191138   0
agréable     69ajl1     Lyon          52         M            bl      l        break    stop               V          Vstop    break       break          weak break      adj    7.95191138   0
Alexandre    91asl1     Brunoy        64         F            dR      R        break    stop               V          Vstop    break       break          weak break      noun   NA           0

From this dataset, I'm trying to predict realization a variable at three-levels with many different predictors. Here's one of the models I've attempted:

k <- length(levels(df$realization))
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))

prior1<-list(
  R = list(fix=1, V=0.5 * (I + J), n = 2
  ),
  G = list(
    G1 = list(V = diag(4), n = 4),
    G2 = list(V = diag(8), n = 8),
    G3 = list(V = diag(4), n = 4),
    G4 = list(V = diag(4), n = 4),
    G5 = list(V = diag(14), n = 14),
    G6 = list(V = 3, n = 3),
    G7 = list(V = 3, n = 3),
    G8 = list(V = diag(6), n = 6)))

m <- MCMCglmm(realization ~ -1 + trait + NC_C1*liquid + right2 + loc_age + logfreq + pos,
              random = ~ us(trait):CG_imm + us(NC_C1*liquid):CG_imm +
              us(trait):locuteur  + us(trait):enquete + us(right2):CG_imm +
              us(loc_age):locuteur + us(log_freq):CG_imm + us(pos):CG_imm,
              rcov = ~ us(trait):units,
              prior = prior1,
              burnin = 15000,
              nitt = 40000,
              family = "categorical",
              data = df)

In my full dataset, the predictor variables I've selected as random effects have the following levels:

> length(levels(ol_north$CG_imm))
[1] 181
> length(levels(ol_north$enquete))
[1] 13
> length(levels(ol_north$locuteur))
[1] 129

All of the predictors that are fixed effects are categorical except for loc_age and log_freq The categorical, fixed predictors are at the following levels:

> length(levels(ol_north$NC_C1))
[1] 4
> length(levels(ol_north$liquid))
[1] 2
> length(levels(ol_north$right2))
[1] 14
> length(levels(ol_north$pos))
[1] 6

I've toyed with the n values in both the G-structure lists and R-structure, and I've adjusted the values in the diag() argument in the G-structure to no avail. With such a complex model, I'm not sure where exactly my errors are occurring. I've simplified the model to this, and do get it to converge but with a warning:


m <- MCMCglmm(realization ~ -1 + trait + NC_C1*liquid + right2,
              random = ~ us(trait):CG_imm + us(NC_C1*liquid):CG_imm,
              rcov = ~ us(trait):units,
              prior = list(
                  R = list(fix=1, V=0.5 * (I + J), n = 2),
                  G = list(
                      G1 = list(V = diag(2), n = 22),
                      G2 = list(V = diag(8), n = 8))),
              burnin = 15000,
              nitt = 40000,
              family = "categorical",
              data = df)

Warning message:
In MCMCglmm(realization ~ -1 + trait + NC_C1 * liquid + right2,  :
  some fixed effects are not estimable and have been removed. Use singular.ok=TRUE to sample these effects, but use an informative prior!

Thank you so much for any help in advance!

Upvotes: 5

Views: 2365

Answers (1)

elena_m
elena_m

Reputation: 41

unless you are expecting something specific from your data, you could try to set the priors in the following way:

priors <- list(R = list(diag(1), nu = 10^-6), G = list(G1 = list(V = 1, nu = 10^-6))) 

setting such a small nu means you're setting a "flat prior", i.e. you are saying "variance could be 1, but I'm not at all sure about it", so you're basically not influencing where the Markov chain starts.

You could also check this tutorial. Good luck!

Upvotes: 0

Related Questions