cgxytf

Reputation: 431

Error in post hoc test for lmer(): both multcomp() and emmeans()

I have a dataset of measurements of "Y" at different locations, and I am trying to determine how Y is influenced by variables A, B, and D by fitting an lmer() model and analyzing the results. However, when I reach the post hoc step, both approaches I have tried fail with an error.

Here is an example of my data:

table <- "   ID location A         B      C     D       Y
1   1       AA 0 0.6181587 -29.67 14.14 168.041
2   2       AA 1 0.5816176 -29.42 14.21 200.991
3   3       AA 2 0.4289670 -28.57 13.55 200.343
4   4       AA 3 0.4158891 -28.59 12.68 215.638
5   5       AA 4 0.3172721 -28.74 12.28 173.299
6   6       AA 5 0.1540603 -27.86 14.01 104.246
7   7       AA 6 0.1219355 -27.18 14.43 128.141
8   8       AA 7 0.1016643 -26.86 13.75 179.330
9   9       BB 0 0.6831649 -28.93 17.03 210.066
10 10       BB 1 0.6796935 -28.54 18.31 280.249
11 11       BB 2 0.5497743 -27.88 17.33 134.023
12 12       BB 3 0.3631052 -27.48 16.79 142.383
13 13       BB 4 0.3875498 -26.98 17.81 136.647
14 14       BB 5 0.3883785 -26.71 17.56 142.179
15 15       BB 6 0.4058061 -26.72 17.71 109.826
16 16       CC 0 0.8647298 -28.53 11.93 220.464
17 17       CC 1 0.8664036 -28.39 11.59 326.868
18 18       CC 2 0.7480748 -27.61 11.75 322.745
19 19       CC 3 0.5959143 -26.81 13.27 170.064
20 20       CC 4 0.4849077 -26.77 14.68 118.092
21 21       CC 5 0.3584687 -26.65 15.65  95.512
22 22       CC 6 0.3018285 -26.33 16.11  71.717
23 23       CC 7 0.2629121 -26.39 16.16  60.052
24 24       DD 0 0.8673077 -27.93 12.09 234.244
25 25       DD 1 0.8226558 -27.96 12.13 244.903
26 26       DD 2 0.7826429 -27.44 12.38 252.485
27 27       DD 3 0.6620447 -27.23 13.84 150.886
28 28       DD 4 0.4453213 -27.03 15.73 102.787
29 29       DD 5 0.3720257 -27.13 16.27 109.201
30 30       DD 6 0.6040217 -27.79 16.41 101.509
31 31       EE 0 0.8770987 -28.62 12.72 239.036
32 32       EE 1 0.8504547 -28.47 12.92 220.600
33 33       EE 2 0.8329484 -28.45 12.94 174.979
34 34       EE 3 0.8181102 -28.37 13.17 138.412
35 35       EE 4 0.7942685 -28.32 13.69 121.330
36 36       EE 5 0.7319724 -28.22 14.62 111.851
37 37       EE 6 0.7014828 -28.24 15.04 110.447
38 38       EE 7 0.7286984 -28.15 15.18 121.831"

#Create a dataframe with the above table
df <- read.table(text=table, header = TRUE)
df

# Make sure location is a factor
df$location<-as.factor(df$location)

Here is my model:

# Load libraries
library(ggplot2)
library(pscl)
library(lmtest)
library(lme4)
library(car)

mod = lmer(Y ~ A * B * poly(D, 2) * (1|location), data = df)
summary(mod)
plot(mod)

I now need to determine what variables significantly influence Y. Thus I ran Anova() from the package car (output pasted here).

Anova(mod)
# Analysis of Deviance Table (Type II Wald chisquare tests)
# 
# Response: Y
#                   Chisq Df Pr(>Chisq)
# A                8.2754  1   0.004019 **
# B                0.0053  1   0.941974
# poly(D, 2)      40.4618  2  1.636e-09 ***
# A:B              0.1709  1   0.679348
# A:poly(D, 2)     1.6460  2   0.439117
# B:poly(D, 2)     5.2601  2   0.072076 .
# A:B:poly(D, 2)   0.6372  2   0.727175
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This suggests that:

A significantly influences Y

B does not significantly influence Y

D significantly influences Y

So next I would run a post hoc test for each of these variables, but this is where I run into issues. I have tried using both emmeans and multcomp packages below:

library(emmeans)
emmeans(mod, list(pairwise ~ A), adjust = "tukey")
# NOTE: Results may be misleading due to involvement in interactions
# Error in if ((misc$estType == "pairs") && (paste(c("", by), collapse = ",") !=  : 
#  missing value where TRUE/FALSE needed

pairs(emmeans(mod, "A"))
# NOTE: Results may be misleading due to involvement in interactions
# Error in if ((misc$estType == "pairs") && (paste(c("", by), collapse = ",") !=  : 
#  missing value where TRUE/FALSE needed

library(multcomp)
summary(glht(mod, linfct = mcp(A = "Tukey")), test = adjusted("fdr"))
# Error in h(simpleError(msg, call)) : 
#  error in evaluating the argument 'object' in selecting a method for function 'summary': Variable(s) ‘depth’ of class ‘integer’ is/are not contained as a factor in ‘model’.

This is the first time I've run an ANOVA/post hoc test on an lmer() model, and though I've read a few introductory sites for this model, I'm not sure I am testing it correctly. Any help would be appreciated.

Upvotes: 0

Views: 1048

Answers (1)

Russ Lenth

Reputation: 6770

If I am looking at the data correctly, A is the variable that has values of 0, 1, ..., 7. Now look at your anova table, where you see that A has only 1 d.f., not 7 as it should for a factor having 8 levels. That means your model is taking A to be a numerical predictor, which is rather meaningless for pairwise comparisons. Make A into a factor and re-fit the model. You'll have better luck.
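The degrees-of-freedom check described above can be reproduced in base R. This is only a small sketch on simulated data: the `toy` data frame and its values are made up for illustration, and a plain `lm()` stands in for the `lmer()` fit so that no extra packages are needed.

```r
# Simulated stand-in data: A takes the values 0..7, as in the question.
# (Hypothetical values, not the real measurements.)
set.seed(42)
toy <- data.frame(A = rep(0:7, times = 5))
toy$Y <- 200 - 10 * toy$A + rnorm(nrow(toy), sd = 20)

# A stored as a number: the model fits a single slope, so A gets 1 d.f.
df_numeric <- anova(lm(Y ~ A, data = toy))["A", "Df"]

# A stored as a factor: one mean per level, so A gets (8 - 1) = 7 d.f.
toy$A_f <- factor(toy$A)
df_factor <- anova(lm(Y ~ A_f, data = toy))["A_f", "Df"]

print(c(numeric = df_numeric, factor = df_factor))
```

A quick `str(df)` or `is.factor(df$A)` on the real data gives the same information before any model is fitted.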

I also think you meant to have + (1|location) at the end of the model formula, rather than having the random effects interacting with some of the polynomial effects.
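Putting both suggestions together might look roughly like the sketch below. It runs on simulated data with the same column names as the question (the `toy` values are invented), and it assumes a simplified additive fixed-effects structure: with A as an 8-level factor, the full `A * B * poly(D, 2)` interaction would need 8 x 2 x 3 = 48 coefficients, more than the 38 rows available, so the original model would likely need simplifying anyway. Which terms to keep is a modelling decision this sketch does not settle.

```r
library(lme4)
library(emmeans)

# Simulated data with the question's column names (hypothetical values).
set.seed(1)
toy <- expand.grid(A = 0:7, location = c("AA", "BB", "CC", "DD", "EE"))
toy$B <- runif(nrow(toy))
toy$D <- runif(nrow(toy), min = 11, max = 18)
loc_shift <- c(AA = -20, BB = -10, CC = 0, DD = 10, EE = 20)
toy$Y <- 200 - 10 * toy$A + 50 * toy$B +
  loc_shift[as.character(toy$location)] + rnorm(nrow(toy), sd = 20)

# Treat A as categorical so that pairwise comparisons are defined.
toy$A <- factor(toy$A)

# The random intercept is added with "+", not multiplied into the fixed effects.
# Assumption: additive fixed effects, simpler than the model in the question.
mod2 <- lmer(Y ~ A + B + poly(D, 2) + (1 | location), data = toy)

# Estimated marginal means for A, then all Tukey-adjusted pairwise contrasts.
emm_A <- emmeans(mod2, "A")
tukey_A <- pairs(emm_A, adjust = "tukey")
print(tukey_A)
```

With 8 levels of A this produces choose(8, 2) = 28 pairwise contrasts. On the real data, replace `toy` with `df` after running `df$A <- factor(df$A)`.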

Upvotes: 3
