Reputation: 51
I've found this same question posted everywhere, and I can't seem to get any of the solutions to fit to my data, and I'm wondering if I'm trying to fit my data to a model that's just too complex.
I'm trying to fit my data to a multinomial logistic regression model from the MCMCglmm package. I've looked at many different documents, tutorials, and the MCMCglmm manual itself, primarily Florian Jaeger's tutorial which is excellent and thorough. I get lost, however, in his selection of the values for both the G-structure and R-structure of the prior, and I keep getting this error message
Error in priorformat(if (NOpriorG) { :
V is the wrong dimension for some prior$G/prior$R elements
In particular, I'm not sure what the n
values should be given my data in both, but the somewhat opaque error message is suggesting that it's an issue with V
Here's a (relevant) subset of my data:
CG_imm locuteur enquete loc_age loc_sexe left liquid right articulation_C1 voice_C1 NC_C1 NC_right voice_right right2 pos logfreq realization
abordable 44ajs1 Nantes 79 M bl l p stop V Vstop stop NV NVstop adj NA 2
admettre 91adb1 Brunoy 54 M tR R E stop N NVstop mid-vowel V mid-vowel verb 6.52209279 0
adorable 91aal1 Brunoy 27 F bl l break stop V Vstop break break weak break adj NA 0
agréable 92aaf2 PC 55 F bl l break stop V Vstop break break strong break adj 7.95191138 0
agréable 21abm1 Dijon 31 M bl l k stop V Vstop stop NV NVstop adj 7.95191138 1
agréable 75ccr2 Paris NA F bl l break stop V Vstop break break break adj 7.95191138 0
agréable 69ajl1 Lyon 52 M bl l break stop V Vstop break break weak break adj 7.95191138 0
Alexandre 91asl1 Brunoy 64 F dR R break stop V Vstop break break weak break noun NA 0
From this dataset, I'm trying to predict realization
a variable at three-levels with many different predictors. Here's one of the models I've attempted:
k <- length(levels(df$realization))
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
prior1<-list(
R = list(fix=1, V=0.5 * (I + J), n = 2
),
G = list(
G1 = list(V = diag(4), n = 4),
G2 = list(V = diag(8), n = 8),
G3 = list(V = diag(4), n = 4),
G4 = list(V = diag(4), n = 4),
G5 = list(V = diag(14), n = 14),
G6 = list(V = 3, n = 3),
G7 = list(V = 3, n = 3),
G8 = list(V = diag(6), n = 6)))
m <- MCMCglmm(realization ~ -1 + trait + NC_C1*liquid + right2 + loc_age + logfreq + pos,
random = ~ us(trait):CG_imm + us(NC_C1*liquid):CG_imm +
us(trait):locuteur + us(trait):enquete + us(right2):CG_imm +
us(loc_age):locuteur + us(log_freq):CG_imm + us(pos):CG_imm,
rcov = ~ us(trait):units,
prior = prior1,
burnin = 15000,
nitt = 40000,
family = "categorical",
data = df)
In my full dataset, the predictor variables I've selected as random effects have the following levels:
> length(levels(ol_north$CG_imm))
[1] 181
> length(levels(ol_north$enquete))
[1] 13
> length(levels(ol_north$locuteur))
[1] 129
All of the predictors that are fixed effects are categorical except for loc_age
and log_freq
The categorical, fixed predictors are at the following levels:
> length(levels(ol_north$NC_C1))
[1] 4
> length(levels(ol_north$liquid))
[1] 2
> length(levels(ol_north$right2))
[1] 14
> length(levels(ol_north$pos))
[1] 6
I've toyed with the n
values in both the G-structure lists and R-structure, and I've adjusted the values in the diag()
argument in the G-structure to no avail. With such a complex model, I'm not sure where exactly my errors are occurring. I've simplified the model to this, and do get it to converge but with a warning:
m <- MCMCglmm(realization ~ -1 + trait + NC_C1*liquid + right2,
random = ~ us(trait):CG_imm + us(NC_C1*liquid):CG_imm,
rcov = ~ us(trait):units,
prior = list(
R = list(fix=1, V=0.5 * (I + J), n = 2),
G = list(
G1 = list(V = diag(2), n = 22),
G2 = list(V = diag(8), n = 8))),
burnin = 15000,
nitt = 40000,
family = "categorical",
data = df)
Warning message:
In MCMCglmm(realization ~ -1 + trait + NC_C1 * liquid + right2, :
some fixed effects are not estimable and have been removed. Use singular.ok=TRUE to sample these effects, but use an informative prior!
Thank you so much for any help in advance!
Upvotes: 5
Views: 2365
Reputation: 41
unless you are expecting something specific from your data, you could try to set the priors in the following way:
priors <- list(R = list(diag(1), nu = 10^-6), G = list(G1 = list(V = 1, nu = 10^-6)))
setting such a small nu means you're setting a "flat prior", i.e. you are saying "variance could be 1, but I'm not at all sure about it", so you're basically not influencing where the Markov chain starts.
You could also check this tutorial. Good luck!
Upvotes: 0