Reputation: 363
I am using ezANOVA to analyse an experimental design with one within-subject variable and one between-subject variable. I ran ezANOVA successfully as follows:
structure(list(Sub = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L,
1L, 1L, 2L, 2L, 2L), .Label = c("A7011", "A7022", "B13", "B14"
), class = "factor"), Depvariable = c(0.375, 0.066667, 0.15,
0.275, 0.025, 0.78333, 0.24167, 0.058333, 0.14167, 0.19167, 0.5,
0), Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = c("A", "B"), class = "factor"), WithinFactor = c(0.6,
0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3)), .Names = c("Sub",
"Depvariable", "Group", "WithinFactor"), row.names = c(NA, 12L
), class = "data.frame")
mod.ez<-ezANOVA(data,
dv = .(Depvariable),
wid = .(Sub), # subject
within = .(WithinFactor),
between=.(Group),
type=3,
detailed=TRUE,
return_aov=TRUE)
I am stuck on how to check that the residuals are normally distributed. I've tried the following:
shapiro.test(as.numeric(residuals(mod.ez$aov)))
But I get the following error:
Error in shapiro.test(as.numeric(residuals(mod.ez$aov))) : sample size must be between 3 and 5000
If I call residuals(mod.ez$aov)
the result is NULL.
Alternatively, I used lmer, where checking the residuals seems straightforward:
plot(fitted(model_lmer), residuals(model_lmer))
However, since ezANOVA also implements sphericity tests and corrections, I would like to stick with it and find a way to check the assumption that the residuals are normally distributed.
Any help greatly appreciated.
Upvotes: 6
Views: 1860
Reputation: 788
In steps:
Full example
First, a full version of your code is:
library(ez)
data <- structure(list(Sub = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L,
1L, 1L, 2L, 2L, 2L), .Label = c("A7011", "A7022", "B13", "B14"
), class = "factor"), Depvariable = c(0.375, 0.066667, 0.15,
0.275, 0.025, 0.78333, 0.24167, 0.058333, 0.14167, 0.19167, 0.5,
0), Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = c("A", "B"), class = "factor"), WithinFactor = c(0.6,
0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3)), .Names = c("Sub",
"Depvariable", "Group", "WithinFactor"), row.names = c(NA, 12L
), class = "data.frame")
mod.ez <- ezANOVA(
data,
dv = .(Depvariable),
wid = .(Sub), # subject
within = .(WithinFactor),
between = .(Group),
type = 3,
detailed = TRUE,
return_aov = TRUE)
How to explore complicated R structures
Second, if you can't find residuals (etc.), a question is: does the result of ezANOVA actually contain them, or has it chucked out the info? For this sort of question, I like to use this function:
wtf_is <- function(x) {
# For when you have no idea what something is.
# https://stackoverflow.com/questions/8855589
cat("1. typeof():\n")
print(typeof(x))
cat("\n2. class():\n")
print(class(x))
cat("\n3. mode():\n")
print(mode(x))
cat("\n4. names():\n")
print(names(x))
cat("\n5. slotNames():\n")
print(slotNames(x))
cat("\n6. attributes():\n")
print(attributes(x))
cat("\n7. str():\n")
str(x) # str() prints its own output; wrapping it in print() adds a stray NULL
}
Thus:
wtf_is(mod.ez)
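When a dump like this runs to many screens, a small recursive search can point at the interesting parts. The following is a minimal sketch and not part of ez or base R: find_named is a hypothetical helper name, and the nested list obj is made-up stand-in data rather than real ezANOVA output.

```r
# Hypothetical helper: return the paths of all list elements called `target`.
find_named <- function(x, target, path = "x") {
  hits <- character(0)
  if (!is.list(x)) return(hits)
  nms <- names(x)
  for (i in seq_along(x)) {
    # Fall back to the position when an element has no usable name.
    has_name <- !is.null(nms) && !is.na(nms[i]) && nzchar(nms[i])
    label <- if (has_name) nms[i] else paste0("[[", i, "]]")
    child <- paste0(path, "$", label)
    if (has_name && nms[i] == target) hits <- c(hits, child)
    # Recurse into the element in case it is itself a list.
    hits <- c(hits, find_named(x[[i]], target, child))
  }
  hits
}

# Made-up nested structure standing in for a model object.
obj <- list(a = list(residuals = 1:3),
            b = list(c = list(residuals = 4:6)),
            d = 1)
found <- find_named(obj, "residuals", path = "obj")
print(found)
```

Calling it as find_named(mod.ez, "residuals", path = "mod.ez") should list every component with that name, which is quicker than reading the whole str() dump.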
Hunting for residuals in ezANOVA output
The output is long. We're looking for vectors of length 12 (since you have 12 data points), or things looking like residuals or predicted values. Part of the output is:
[...]
7. str():
List of 2
$ ANOVA:'data.frame': 3 obs. of 9 variables:
[...]
$ aov :List of 4
..$ (Intercept) :List of 9
[...]
..$ Sub :List of 9
[...]
.. ..$ residuals : Named num [1:3] 0.102 -0.116 0.164
.. .. ..- attr(*, "names")= chr [1:3] "2" "3" "4"
[...]
.. ..$ fitted.values: Named num [1:3] -1.39e-17 1.28e-01 9.03e-02
.. .. ..- attr(*, "names")= chr [1:3] "2" "3" "4"
..$ Sub:WithinFactor:List of 9
[...]
.. ..$ residuals : Named num [1:4] 0.00964 0.00964 0.23081 -0.23081
.. .. ..- attr(*, "names")= chr [1:4] "5" "6" "7" "8"
[...]
.. ..$ fitted.values: Named num [1:4] 0.0804 -0.0804 -0.0444 -0.0444
.. .. ..- attr(*, "names")= chr [1:4] "5" "6" "7" "8"
[...]
..$ Within :List of 6
[...]
.. ..$ residuals : num [1:4, 1] 0.3286 0.1098 -0.4969 0.0564
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:4] "9" "10" "11" "12"
.. .. .. ..$ : NULL
.. ..$ fitted.values: num [1:4, 1] 0 0 0 0
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:4] "9" "10" "11" "12"
.. .. .. ..$ : NULL
[...]
..- attr(*, "error.qr")=List of 5
.. ..$ qr : num [1:12, 1:8] -3.464 0.289 0.289 0.289 0.289 ...
.. .. ..- attr(*, "dimnames")=List of 2
.. .. .. ..$ : chr [1:12] "1" "2" "3" "4" ...
.. .. .. ..$ : chr [1:8] "(Intercept)" "Sub1" "Sub2" "Sub3" ...
.. .. ..- attr(*, "assign")= int [1:8] 0 1 1 1 2 2 2 2
.. .. ..- attr(*, "contrasts")=List of 1
.. .. .. ..$ Sub: chr "contr.helmert"
[...]
... none of which looks very helpful to me: no stratum holds a residual vector of length 12. So the answer is probably "it's not there", or "not obviously there", and others agree; see the related question "ggplot2 residuals with ezANOVA".
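The NULL from residuals() is consistent with how base R stores a multi-stratum fit: the object is a list with one fit per error stratum, and there is no single residuals component at the top level. Here is a minimal sketch with stats::aov on the same 12 values. It assumes a plain Error(Sub) term and a discrete within factor, which is not necessarily the model ezANOVA builds internally; the data frame d and its column names are chosen for the example.

```r
# The 12 observations from the question, rebuilt by hand.
d <- data.frame(
  Sub     = factor(rep(c("B13", "B14", "A7011", "A7022"), each = 3)),
  Group   = factor(rep(c("B", "A"), each = 6)),
  WithinF = factor(rep(c(0.6, 0, -0.3), times = 4)),
  Dep     = c(0.375, 0.066667, 0.15, 0.275, 0.025, 0.78333,
              0.24167, 0.058333, 0.14167, 0.19167, 0.5, 0)
)

# Multi-stratum fit: subjects form the error stratum for the between effect.
fit <- aov(Dep ~ Group * WithinF + Error(Sub), data = d)

strata <- names(fit)      # one entry per error stratum
top    <- residuals(fit)  # NULL: no top-level residuals component

# Each stratum is an ordinary fit carrying its own projected residuals.
sub_res    <- residuals(fit[["Sub"]])
within_res <- residuals(fit[["Within"]])

print(strata)
print(is.null(top))
print(c(Sub = length(sub_res), Within = length(within_res)))
```

The per-stratum residuals live on a projected scale and there are fewer than 12 of them in any one stratum, so they are not a drop-in replacement for ordinary residuals.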
Using afex::aov_ez instead
So you could instead use:
library(afex)
model2 <- aov_ez(
id = "Sub", # subject
dv = "Depvariable",
data = data,
between = c("Group"),
within = c("WithinFactor"),
type = "III" # or 3; type III sums of squares
)
anova(model2)
summary(model2)
residuals(model2$lm)
... and that does give you residuals.
However, it also gives different F/p values.
Noting why aov_ez and ezANOVA are giving different answers here
We have:
> mod.ez
$ANOVA
              Effect DFn DFd         SSn        SSd          F         p p<.05         ges
1              Group   1   2 0.024449088 0.05070517 0.96436277 0.4296328       0.134418588
2       WithinFactor   1   2 0.001296481 0.10673345 0.02429382 0.8904503       0.008167579
3 Group:WithinFactor   1   2 0.015557350 0.10673345 0.29151781 0.6433264       0.089928978
> anova(model2)
Anova Table (Type III tests)
Response: Depvariable
                   num Df den Df      MSE      F     ges Pr(>F)
Group              1.0000 2.0000 0.025353 0.9644 0.07197 0.4296
WithinFactor       1.4681 2.9363 0.090093 0.2322 0.08876 0.7471
Group:WithinFactor 1.4681 2.9363 0.090093 1.5001 0.38628 0.3370
Different results. Notice the warning message from mod.ez:
Warning: "WithinFactor" will be treated as numeric
... i.e. as a continuous predictor (covariate), not a discrete predictor (factor). So we should look at the covariate and factorize arguments; see ?aov_ez. I must say that I struggled a bit to work out how to do a within-subjects ANCOVA here. The factorize part applies only to between-subjects predictors, if I read the docs correctly, and similarly covariate is for between-subjects covariates only.
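The practical difference between the two treatments shows up in the design matrix: a numeric predictor contributes one slope column (1 df), while a three-level factor contributes two contrast columns (2 df). That is consistent with DFn = 1 for WithinFactor in the ezANOVA table above. A small base-R illustration (the variable names are chosen for the example):

```r
# The WithinFactor values from the question, stored as plain numbers.
w <- rep(c(0.6, 0, -0.3), times = 4)

X_num <- model.matrix(~ w)          # intercept + one slope column
X_fac <- model.matrix(~ factor(w))  # intercept + two dummy columns

# Degrees of freedom used by the predictor = columns beyond the intercept.
df_num <- ncol(X_num) - 1
df_fac <- ncol(X_fac) - 1
print(c(numeric = df_num, factor = df_fac))
```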
As a quick check, if you use ezANOVA and force it to use WithinFactor as a discrete (not continuous) predictor, like this:
data$WithinCovariate <- data$WithinFactor # so the name is clearer!
data$WithinFactorDiscrete <- as.factor(data$WithinFactor)
mod.ez.discrete <- ezANOVA(
data,
dv = .(Depvariable),
wid = .(Sub), # subject
within = .(WithinFactorDiscrete),
between = .(Group),
type = 3,
detailed = TRUE,
return_aov = TRUE)
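... you get F values that match aov_ez (the p values for the within-subjects terms still differ, because anova() on the afex model reports Greenhouse-Geisser-corrected degrees of freedom by default, hence the fractional Df of 1.4681 above):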
> mod.ez.discrete
$ANOVA
                      Effect DFn DFd        SSn        SSd          F          p p<.05        ges
1                (Intercept)   1   2 0.65723113 0.05070517 25.9236350 0.03647725     * 0.67583504
2                      Group   1   2 0.02444909 0.05070517  0.9643628 0.42963280       0.07197457
3       WithinFactorDiscrete   2   4 0.03070651 0.26453641  0.2321534 0.80280844       0.08876045
4 Group:WithinFactorDiscrete   2   4 0.19841198 0.26453641  1.5000731 0.32651697       0.38627588
So that gets you matching results, and Greenhouse-Geisser/Huynh-Feldt corrections, and residuals, for everything except within-subjects covariates.
Finally...
What does it mean to check sphericity with a continuous within-subjects predictor? I am entirely unclear on that; sphericity relates to homogeneity of variance of the differences between pairs of values at various levels of the within-subjects factor(s). If the predictor is continuous, there are no pairs.
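To make that concrete: with WithinFactor treated as three discrete levels, sphericity concerns the variances of the three pairwise difference scores across subjects. Below is a rough base-R sketch using the 12 values from the question. With only four subjects these variances are very noisy, so treat it as an illustration of the definition rather than a test; the object names are chosen for the example.

```r
dep <- c(0.375, 0.066667, 0.15, 0.275, 0.025, 0.78333,
         0.24167, 0.058333, 0.14167, 0.19167, 0.5, 0)

# Rows = subjects, columns = levels of the within-subjects factor.
wide <- matrix(dep, ncol = 3, byrow = TRUE,
               dimnames = list(c("B13", "B14", "A7011", "A7022"),
                               c("0.6", "0", "-0.3")))

# All pairs of levels, then the variance of each difference score.
lvl_pairs <- combn(colnames(wide), 2)
diff_var <- apply(lvl_pairs, 2,
                  function(p) var(wide[, p[1]] - wide[, p[2]]))
names(diff_var) <- apply(lvl_pairs, 2, paste, collapse = " vs ")

# Sphericity holds when these variances are (roughly) equal.
print(diff_var)
```

With a continuous predictor there are no columns to pair up in this way, which is the difficulty raised above.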
So at the risk of being wrong, I would either (a) trust ezANOVA and forgo residuals, or (b) use something that can do everything except the sphericity tests, like this:
library(lme4)
library(lmerTest) # upgrades reports from lme4 to include p values! ;)
mod.lmer.wscov_interact <- lmer(
Depvariable ~
Group * WithinCovariate
+ (1 | Sub),
data = data
)
anova(mod.lmer.wscov_interact)
residuals(mod.lmer.wscov_interact)
mod.lmer.wscov_no_interact <- lmer(
Depvariable ~
Group + WithinCovariate
+ (1 | Sub),
data = data
)
anova(mod.lmer.wscov_no_interact)
mod.lmer.wsfac <- lmer(
Depvariable ~
Group * WithinFactorDiscrete
+ (1 | Sub),
data = data
)
anova(mod.lmer.wsfac)
giving:
> anova(mod.lmer.wscov_interact)
Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom
                        Sum Sq  Mean Sq NumDF DenDF F.value Pr(>F)
Group                 0.033586 0.033586     1     8 0.50936 0.4957
WithinCovariate       0.001296 0.001296     1     8 0.01966 0.8920
Group:WithinCovariate 0.015557 0.015557     1     8 0.23594 0.6402
> residuals(mod.lmer.wscov_interact)
1 2 3 4 5 6 7 8 9 10 11 12
0.130059250 -0.219344250 -0.156546500 0.030059250 -0.261011250 0.476783500 -0.009225679 -0.118156464 0.002383643 -0.059225679 0.323510536 -0.139286357
> anova(mod.lmer.wscov_no_interact)
Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom
                   Sum Sq   Mean Sq NumDF DenDF F.value Pr(>F)
Group           0.0244491 0.0244491     1     9 0.40519 0.5403
WithinCovariate 0.0012965 0.0012965     1     9 0.02149 0.8867
> anova(mod.lmer.wsfac)
Analysis of Variance Table of type III with Satterthwaite
approximation for degrees of freedom
                             Sum Sq  Mean Sq NumDF DenDF F.value Pr(>F)
Group                      0.024449 0.024449     1     6 0.46534 0.5206
WithinFactorDiscrete       0.030707 0.015353     2     6 0.29222 0.7567
Group:WithinFactorDiscrete 0.198412 0.099206     2     6 1.88819 0.2312
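With a residual vector available, the normality check from the question can be run directly. The sketch below uses the residuals printed above for mod.lmer.wscov_interact, copied in by hand so the block runs without refitting the model. With n = 12 the test has little power, so a Q-Q plot is a useful companion.

```r
# Residuals of mod.lmer.wscov_interact, copied from the output above.
res <- c(0.130059250, -0.219344250, -0.156546500, 0.030059250,
         -0.261011250, 0.476783500, -0.009225679, -0.118156464,
         0.002383643, -0.059225679, 0.323510536, -0.139286357)

sw <- shapiro.test(res)  # Shapiro-Wilk test of normality
print(sw)

qqnorm(res)              # points close to the line suggest approximate normality
qqline(res)
```

In a live session, res <- residuals(mod.lmer.wscov_interact) would replace the hand-copied vector.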
Upvotes: 6