ezANOVA R check error normally distributed

Question

I am using ezANOVA to analyse an experimental design with one within-subject variable and one between-subject variable. I successfully ran ezANOVA as follows:

structure(list(Sub = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L, 
1L, 1L, 2L, 2L, 2L), .Label = c("A7011", "A7022", "B13", "B14"
), class = "factor"), Depvariable = c(0.375, 0.066667, 0.15, 
0.275, 0.025, 0.78333, 0.24167, 0.058333, 0.14167, 0.19167, 0.5, 
0), Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("A", "B"), class = "factor"), WithinFactor = c(0.6, 
0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3)), .Names = c("Sub", 
"Depvariable", "Group", "WithinFactor"), row.names = c(NA, 12L
 ), class = "data.frame")


mod.ez<-ezANOVA(data,
          dv = .(Depvariable),
          wid = .(Sub),  # subject
          within = .(WithinFactor),  
          between=.(Group),
          type=3, 
          detailed=TRUE,
          return_aov=TRUE)

I am stuck on how to check that the residuals are normally distributed. I've tried the following:

shapiro.test(as.numeric(residuals(mod.ez$aov)))

But I get the following error:

Error in shapiro.test(as.numeric(residuals(mod.ez$aov))) : sample size must be between 3 and 5000

If I call residuals(mod.ez$aov) the result is NULL.

Alternatively, I used lmer, where checking the residuals seems straightforward:

plot(fitted(model_lmer), residuals(model_lmer))

However, as ezANOVA also implements sphericity tests and corrections, I would like to stick with it and find a way to check the assumption of normality of the residuals.

Any help greatly appreciated

Rudolf Cardinal · Accepted Answer

In steps:

Full example

First, a full version of your code is:

library(ez)

data <- structure(list(Sub = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L, 
1L, 1L, 2L, 2L, 2L), .Label = c("A7011", "A7022", "B13", "B14"
), class = "factor"), Depvariable = c(0.375, 0.066667, 0.15, 
0.275, 0.025, 0.78333, 0.24167, 0.058333, 0.14167, 0.19167, 0.5, 
0), Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("A", "B"), class = "factor"), WithinFactor = c(0.6, 
0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3)), .Names = c("Sub", 
"Depvariable", "Group", "WithinFactor"), row.names = c(NA, 12L
 ), class = "data.frame")

mod.ez <- ezANOVA(
    data,
    dv = .(Depvariable),
    wid = .(Sub),  # subject
    within = .(WithinFactor),  
    between = .(Group),
    type = 3, 
    detailed = TRUE,
    return_aov = TRUE)

How to explore complicated R structures

Second, if you can't find residuals (etc.), a question is: does the result of ezANOVA actually contain them, or has it chucked out the info? For this sort of question, I like to use this function:

wtf_is <- function(x) {
    # For when you have no idea what something is.
    # https://stackoverflow.com/questions/8855589
    cat("1. typeof():\n")
    print(typeof(x))
    cat("\n2. class():\n")
    print(class(x))
    cat("\n3. mode():\n")
    print(mode(x))
    cat("\n4. names():\n")
    print(names(x))
    cat("\n5. slotNames():\n")
    print(slotNames(x))
    cat("\n6. attributes():\n")
    print(attributes(x))
    cat("\n7. str():\n")
    print(str(x))
}

Thus:

wtf_is(mod.ez)
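
When the dump from a call like this is long, it can help to search the structure rather than read it. The helper below is a minimal sketch (find_by_length is a hypothetical name, not part of ez or base R) that walks a nested list and reports the path to every atomic vector of a given length, for instance 12 when there are 12 observations. It only follows list elements, not attributes, so anything stored in an attribute will be missed.

```r
# Hypothetical helper: list the paths of all atomic vectors of length n
# inside a nested list. Attributes are NOT searched.
find_by_length <- function(x, n, path = "x") {
    hits <- character(0)
    if (is.atomic(x) && length(x) == n) {
        hits <- path
    }
    if (is.list(x)) {
        nms <- names(x)
        for (i in seq_along(x)) {
            # Use the element name if there is one, else its position
            label <- if (!is.null(nms) && nzchar(nms[i])) nms[i] else paste0("[[", i, "]]")
            hits <- c(hits, find_by_length(x[[i]], n, paste(path, label, sep = "$")))
        }
    }
    hits
}

# Toy structure, purely for illustration
toy <- list(a = 1:12, b = list(c = letters[1:3], d = rnorm(12)))
found <- find_by_length(toy, 12, "toy")
print(found)
```

On the real object, the equivalent call would be find_by_length(mod.ez, 12, "mod.ez").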

Hunting for residuals in ezANOVA output

The output is long. We're looking for vectors of length 12 (since you have 12 data points), or things looking like residuals or predicted values. Part of the output is:

[...]
7. str():
List of 2
 $ ANOVA:'data.frame':  3 obs. of  9 variables:
 [...]
 $ aov  :List of 4
  ..$ (Intercept)     :List of 9
  [...]
  ..$ Sub             :List of 9
  [...]
  .. ..$ residuals    : Named num [1:3] 0.102 -0.116 0.164
  .. .. ..- attr(*, "names")= chr [1:3] "2" "3" "4"
  [...]
  .. ..$ fitted.values: Named num [1:3] -1.39e-17 1.28e-01 9.03e-02
  .. .. ..- attr(*, "names")= chr [1:3] "2" "3" "4"
  ..$ Sub:WithinFactor:List of 9
  [...]
  .. ..$ residuals    : Named num [1:4] 0.00964 0.00964 0.23081 -0.23081
  .. .. ..- attr(*, "names")= chr [1:4] "5" "6" "7" "8"
  [...]
  .. ..$ fitted.values: Named num [1:4] 0.0804 -0.0804 -0.0444 -0.0444
  .. .. ..- attr(*, "names")= chr [1:4] "5" "6" "7" "8"
  [...]
  ..$ Within          :List of 6
  [...]
  .. ..$ residuals    : num [1:4, 1] 0.3286 0.1098 -0.4969 0.0564
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:4] "9" "10" "11" "12"
  .. .. .. ..$ : NULL
  .. ..$ fitted.values: num [1:4, 1] 0 0 0 0
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:4] "9" "10" "11" "12"
  .. .. .. ..$ : NULL
  [...]
  ..- attr(*, "error.qr")=List of 5
  .. ..$ qr   : num [1:12, 1:8] -3.464 0.289 0.289 0.289 0.289 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:12] "1" "2" "3" "4" ...
  .. .. .. ..$ : chr [1:8] "(Intercept)" "Sub1" "Sub2" "Sub3" ...
  .. .. ..- attr(*, "assign")= int [1:8] 0 1 1 1 2 2 2 2
  .. .. ..- attr(*, "contrasts")=List of 1
  .. .. .. ..$ Sub: chr "contr.helmert"
  [...]

... none of which looks very helpful to me. So the answer is probably "it's not there", or "not obviously there", and others agree; see the related question "ggplot2 residuals with ezANOVA".
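
A likely reason for the NULL: the aov element appears to be an ordinary stats::aov() fit with an Error() term, and such fits have class "aovlist", i.e. one sub-fit per error stratum rather than a single model with one residual per observation. The sketch below reproduces that with base R only. The formula is my assumption about roughly what ezANOVA fits, not something taken from its source, and the data frame is a compact rebuild of the one in the question.

```r
# Compact rebuild of the question's data (same values, same row order)
data <- data.frame(
    Sub = factor(rep(c("B13", "B14", "A7011", "A7022"), each = 3)),
    Group = factor(rep(c("B", "A"), each = 6)),
    WithinFactor = rep(c(0.6, 0, -0.3), times = 4),
    Depvariable = c(0.375, 0.066667, 0.15, 0.275, 0.025, 0.78333,
                    0.24167, 0.058333, 0.14167, 0.19167, 0.5, 0)
)

# Assumed formula: a multistratum fit with subjects as the error term
fit <- aov(
    Depvariable ~ Group * WithinFactor + Error(Sub / WithinFactor),
    data = data
)

print(class(fit))               # includes "aovlist"
print(names(fit))               # one entry per error stratum
print(is.null(residuals(fit)))  # no residuals at the top level

# Each stratum carries its own (projected) residuals, fewer than nrow(data)
stratum_n <- sapply(unclass(fit), function(s) length(s$residuals))
print(stratum_n)
```

So the per-stratum residuals exist, but they are not the 12 per-observation residuals that a normality check would usually be run on.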

Using afex::aov_ez instead

So you could instead use:

library(afex)
model2 <- aov_ez(
    id = "Sub",  # subject
    dv = "Depvariable",
    data = data,
    between = c("Group"),
    within = c("WithinFactor"),
    type = "III"  # or 3; type III sums of squares
)
anova(model2)
summary(model2)
residuals(model2$lm)

... and that does give you residuals.

However, it also gives different F/p values.

Noting why aov_ez and ezANOVA are giving different answers here

We have:

> mod.ez
$ANOVA
              Effect DFn DFd         SSn        SSd          F         p p<.05         ges
1              Group   1   2 0.024449088 0.05070517 0.96436277 0.4296328       0.134418588
2       WithinFactor   1   2 0.001296481 0.10673345 0.02429382 0.8904503       0.008167579
3 Group:WithinFactor   1   2 0.015557350 0.10673345 0.29151781 0.6433264       0.089928978

> anova(model2)
Anova Table (Type III tests)

Response: Depvariable
                   num Df den Df      MSE      F     ges Pr(>F)
Group              1.0000 2.0000 0.025353 0.9644 0.07197 0.4296
WithinFactor       1.4681 2.9363 0.090093 0.2322 0.08876 0.7471
Group:WithinFactor 1.4681 2.9363 0.090093 1.5001 0.38628 0.3370

Different results. Notice the warning message from mod.ez:

Warning: "WithinFactor" will be treated as numeric

... i.e. as a continuous predictor (covariate), not a discrete predictor (factor). So we should look at the covariate and factorize arguments; see ?aov_ez. I must say that I struggled a bit to work out how to do a within-subjects ANCOVA here. The factorize part applies only to between-subjects predictors, if I read the docs correctly, and similarly covariate is for between-subjects covariates only.

As a quick check, if you use ezANOVA and force it to use WithinFactor as a discrete (not continuous) predictor, like this:

data$WithinCovariate <- data$WithinFactor  # so the name is clearer!
data$WithinFactorDiscrete <- as.factor(data$WithinFactor)
mod.ez.discrete <- ezANOVA(
    data,
    dv = .(Depvariable),
    wid = .(Sub),  # subject
    within = .(WithinFactorDiscrete),  
    between = .(Group),
    type = 3, 
    detailed = TRUE,
    return_aov = TRUE)

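... you get F values that match aov_ez (the p values for the within-subjects terms still differ, which looks like the Greenhouse-Geisser correction that the aov_ez table applies by default, judging by its fractional degrees of freedom):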

> mod.ez.discrete
$ANOVA
                      Effect DFn DFd        SSn        SSd          F          p p<.05        ges
1                (Intercept)   1   2 0.65723113 0.05070517 25.9236350 0.03647725     * 0.67583504
2                      Group   1   2 0.02444909 0.05070517  0.9643628 0.42963280       0.07197457
3       WithinFactorDiscrete   2   4 0.03070651 0.26453641  0.2321534 0.80280844       0.08876045
4 Group:WithinFactorDiscrete   2   4 0.19841198 0.26453641  1.5000731 0.32651697       0.38627588

So that gets you matching results, and Greenhouse-Geisser/Huynh-Feldt corrections, and residuals, for everything except within-subjects covariates.

Finally...

What does it mean to check sphericity with a continuous within-subjects predictor? I am entirely unclear on that. Sphericity relates to homogeneity of variance of the differences between pairs of values at various levels of the within-subjects factor(s); if the predictor is continuous, there are no levels and hence no pairs.
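
To make the "pairs" idea concrete, here is a rough base-R sketch that treats the three WithinFactor values as discrete levels and computes the variance of the subject-wise differences for each pair of levels. It ignores Group for simplicity and is only an illustration of the quantity sphericity is about, not a replacement for Mauchly's test; the data frame is a compact rebuild of the one in the question.

```r
# Compact rebuild of the relevant columns of the question's data
data <- data.frame(
    Sub = rep(c("B13", "B14", "A7011", "A7022"), each = 3),
    WithinFactor = rep(c(0.6, 0, -0.3), times = 4),
    Depvariable = c(0.375, 0.066667, 0.15, 0.275, 0.025, 0.78333,
                    0.24167, 0.058333, 0.14167, 0.19167, 0.5, 0)
)

# Subjects in rows, within-subject levels in columns
wide <- with(data, tapply(Depvariable, list(Sub, WithinFactor), mean))

# Variance of the subject-wise difference for every pair of levels
level_pairs <- combn(colnames(wide), 2)
diff_var <- apply(level_pairs, 2, function(p) var(wide[, p[1]] - wide[, p[2]]))
names(diff_var) <- apply(level_pairs, 2, paste, collapse = " vs ")
print(diff_var)
```

Sphericity asks whether these variances are roughly equal. With a truly continuous predictor there is no such table of levels to build, which is the conceptual problem.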

So at the risk of being wrong, I would either (a) trust ezANOVA and forgo residuals, or (b) use something that can do everything except the sphericity tests, like this:

library(lme4)
library(lmerTest)  # upgrades reports from lme4 to include p values! ;)

mod.lmer.wscov_interact <- lmer(
    Depvariable ~
        Group * WithinCovariate
        + (1 | Sub),
    data = data
)
anova(mod.lmer.wscov_interact)
residuals(mod.lmer.wscov_interact)

mod.lmer.wscov_no_interact <- lmer(
    Depvariable ~
        Group + WithinCovariate
        + (1 | Sub),
    data = data
)
anova(mod.lmer.wscov_no_interact)

mod.lmer.wsfac <- lmer(
    Depvariable ~
        Group * WithinFactorDiscrete
        + (1 | Sub),
    data = data
)
anova(mod.lmer.wsfac)

giving

> anova(mod.lmer.wscov_interact)
Analysis of Variance Table of type III  with  Satterthwaite 
approximation for degrees of freedom
                        Sum Sq  Mean Sq NumDF DenDF F.value Pr(>F)
Group                 0.033586 0.033586     1     8 0.50936 0.4957
WithinCovariate       0.001296 0.001296     1     8 0.01966 0.8920
Group:WithinCovariate 0.015557 0.015557     1     8 0.23594 0.6402

> residuals(mod.lmer.wscov_interact)
           1            2            3            4            5            6            7            8            9           10           11           12 
 0.130059250 -0.219344250 -0.156546500  0.030059250 -0.261011250  0.476783500 -0.009225679 -0.118156464  0.002383643 -0.059225679  0.323510536 -0.139286357 

> anova(mod.lmer.wscov_no_interact)
Analysis of Variance Table of type III  with  Satterthwaite 
approximation for degrees of freedom
                   Sum Sq   Mean Sq NumDF DenDF F.value Pr(>F)
Group           0.0244491 0.0244491     1     9 0.40519 0.5403
WithinCovariate 0.0012965 0.0012965     1     9 0.02149 0.8867

> anova(mod.lmer.wsfac)
Analysis of Variance Table of type III  with  Satterthwaite 
approximation for degrees of freedom
                             Sum Sq  Mean Sq NumDF DenDF F.value Pr(>F)
Group                      0.024449 0.024449     1     6 0.46534 0.5206
WithinFactorDiscrete       0.030707 0.015353     2     6 0.29222 0.7567
Group:WithinFactorDiscrete 0.198412 0.099206     2     6 1.88819 0.2312
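
Once you have a residual vector like the one printed by residuals(mod.lmer.wscov_interact) above, the normality check from the question works as intended. A minimal sketch, with the 12 values pasted in from that output so that it runs without lme4 (in practice you would pass the residuals() result directly):

```r
# Residuals copied from the residuals(mod.lmer.wscov_interact) output above
res <- c(
    0.130059250, -0.219344250, -0.156546500, 0.030059250,
    -0.261011250, 0.476783500, -0.009225679, -0.118156464,
    0.002383643, -0.059225679, 0.323510536, -0.139286357
)

# Shapiro-Wilk test; requires between 3 and 5000 values
sw <- shapiro.test(res)
print(sw)

# A Q-Q plot is usually more informative than the test alone
if (interactive()) {
    qqnorm(res)
    qqline(res)
}
```

With only 12 observations the test has little power to detect non-normality, so I would treat the Q-Q plot as the main check and the p value as a rough guide at best.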
