Marco Plebani

Reputation: 486

R package DHARMa detects deviations from normality even if a mixed model is fitted to gaussian-generated data


Using R, I generated data from normal distributions, but when I fitted a random-intercept mixed-effects model to them, both the Kolmogorov-Smirnov test and the quantile-quantile plot produced by the R package DHARMa indicated deviations from the expected distribution. Why is that?

Test 1

I generated a set of random-intercept data in which both the number of replicates and the within-group standard deviation vary across the levels of the random effect:

rm(list=ls())
library(DHARMa)
library(glmmTMB)
library(ggplot2)

# "regression" with an 8-level random effect: group-specific intercepts
# and residual SDs, common slope
set.seed(666)
ii=8 # parameter set size; no. groups
reps <- sample(x=8:12, size=ii, replace=T)
slope <- 3
intercept <- runif(ii, min = 70, max = 140)
group.sd <- runif(ii, min = 5, max = 15)
group.ID <- LETTERS[1:ii]

xx <- 1:15

out.list <- list() # empty list to store simulated data

for (i in 1:ii){
        df00 <- data.frame(Group=rep(group.ID[i], reps[i]*length(xx)),
                                    x=rep(xx, reps[i]))
        df00$y = intercept[i] + df00$x * slope + rnorm(length(df00[,1]), 0, group.sd[i])
        out.list[[i]] = df00
        }

# to turn out.list into a data.frame:
df <- do.call(rbind.data.frame, out.list)

# data visualisation
ggplot(df, aes(x = x, y = y, colour = Group)) + 
    geom_point() + geom_smooth(method="lm") + theme_classic()

[Plot: simulated data coloured by group, with per-group regression lines]

If I fit a random-intercept model to my dataset:

glmmTMB_ri <- glmmTMB(y~x + (1|Group), data=df)

These are the diagnostic plots and tests produced by DHARMa:

mem1Resids <- simulateResiduals(glmmTMB_ri, n = length(df[, 1]) * 3)
par(mfrow = c(1, 2))
plotQQunif(mem1Resids)
mtext(text = "(a)", side = 3, adj = 0, line = 2)
plotResiduals(mem1Resids, quantreg = T)
mtext(text = "(b)", side = 3, adj = 0, line = 2)

[Plots: (a) DHARMa Q-Q plot of scaled residuals; (b) residuals vs. model predictions]

Test 2

I thought that the issue could be that the resolution of the simulated residuals was too low. However, increasing the number of simulations performed by simulateResiduals() from three times the size of my dataset to ten times its size doesn't improve the diagnostic plots:

[Plots: DHARMa diagnostics with n = 10 times the dataset size]

Test 3

I thought the issue might lie not with DHARMa's settings but with my dataset (perhaps too small or too variable). I generated a new dataset with larger, uniform replication for each random group (40 observations for each value of x in each group) and with lower, uniform within-group variability (SD = 5 for all random groups):

reps <- rep(40, length.out=ii)
# let's also reduce and uniformise maximum within-group variability:
group.sd <- rep(5, ii)

out.list <- list() # empty list to store simulated data

for (i in 1:ii){
        df00 <- data.frame(Group=rep(group.ID[i], reps[i]*length(xx)),
                                    x=rep(xx, reps[i]))
        df00$y = intercept[i] + df00$x * slope + rnorm(length(df00[,1]), 0, group.sd[i])
        out.list[[i]] = df00
        }

# to turn out.list into a data.frame:
df2 <- do.call(rbind.data.frame, out.list)

# data visualisation
ggplot(df2, aes(x = x, y = y, colour = Group)) +
    geom_point() + geom_smooth(method="lm") + theme_classic()

[Plot: second simulated dataset coloured by group, with per-group regression lines]

If I fit a random-intercept model to my new dataset:

glmmTMB_ri2 <- glmmTMB(y~x + (1|Group), data=df2)

These are the diagnostic plots and tests produced by DHARMa:

mem2Resids <- simulateResiduals(glmmTMB_ri2, n = length(df2[, 1]) * 3)
par(mfrow = c(1, 2))
plotQQunif(mem2Resids)
mtext(text = "(a)", side = 3, adj = 0, line = 2)
plotResiduals(mem2Resids, quantreg = T)
mtext(text = "(b)", side = 3, adj = 0, line = 2)

[Plots: (a) DHARMa Q-Q plot; (b) residuals vs. predictions, for the second model]

If anything, the diagnostic plots show a worse situation than they did before.

Why does DHARMa flag my models' residual distributions as non-normal even though I generated the data to be Gaussian? If the problem lies neither with the sample size nor with DHARMa, perhaps I simulated my data incorrectly, but that doesn't seem to be the case either.

Upvotes: 3

Views: 1664

Answers (3)

Ben Bolker

Reputation: 226712

To add another perspective, using performance::check_model() (which as @FlorianHartig points out would correspond to using conditional rather than unconditional simulation in DHARMa) on the first model gives:

[Plot: performance::check_model() diagnostic panel for the first model]
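For reference, a minimal call that would reproduce this kind of check (assuming the glmmTMB_ri object fitted in the question):

```r
# check_model() draws a panel of diagnostics: posterior predictive check,
# residual plots, and a Q-Q plot of the estimated random effects
library(performance)
check_model(glmmTMB_ri)
```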

The two deviations of your simulation from mixed-model assumptions (uniform distribution of REs and group-specific SDs) are indicated by:

  • flatter marginal distribution relative to posterior predictions (top left plot) (both?)
  • slightly decreasing variance with mean (middle left) (both?)
  • fat-tailed Q-Q plot (middle/bottom right) (group SDs?)
  • slightly deviant distribution of RE values (bottom left) (uniform REs?)

Looking at your second set of simulations (homogeneous SDs by group)

[Plot: performance::check_model() diagnostic panel for the second model]

  • As far as the conditional performance is concerned, this now looks pretty good (heteroscedasticity and fat tails of residuals are gone)

Upvotes: 2

Florian Hartig

Reputation: 206

Disclaimer: I'm the developer of DHARMa.

First of all, for DHARMa-specific questions, please also consider using our issue tracker: https://github.com/florianhartig/DHARMa/issues

Second, regarding your question / results: when you simulate with different SDs for the groups, you violate the assumptions of the mixed model, which assumes a common residual standard deviation and REs drawn from a normal distribution. EDIT: plus, it seems to me you are drawing the random intercepts from a uniform distribution, whereas a standard mixed model assumes they are normal.
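For comparison, here is a sketch of a simulation that does satisfy those assumptions, reusing the question's design (8 groups, x from 1 to 15, slope 3) but with normally distributed intercepts and one shared residual SD:

```r
# Same design as in the question, but with normal random intercepts and a
# single residual SD shared by all groups (values illustrative)
set.seed(666)
ii <- 8
intercept <- rnorm(ii, mean = 105, sd = 20)  # normal REs, not uniform
resid.sd  <- 10                              # common residual SD
xx <- 1:15
out.list <- lapply(1:ii, function(i) {
  df00 <- data.frame(Group = LETTERS[i], x = rep(xx, 10))
  df00$y <- intercept[i] + 3 * df00$x + rnorm(nrow(df00), 0, resid.sd)
  df00
})
df_ok <- do.call(rbind, out.list)
```

A random-intercept model fitted to df_ok should then pass the DHARMa checks (up to type-I error).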

Thus, I'm not sure why you are concerned: I would say it's rather encouraging that DHARMa highlights this problem. If your question is about the specific reason why it is highlighted: the DHARMa default is to re-simulate unconditionally, i.e. including the REs. If you switched to simulations conditional on the fitted REs, which corresponds more closely to standard residual checks, the problem would likely be less visible.
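A sketch of that conditional check. Note that whether re.form is honoured depends on the model class's simulate() method; glmmTMB support may vary, so this refits with lme4 as an assumption:

```r
# re.form = NULL conditions the simulations on the fitted random effects
# (lme4 convention), which is closer to a classical residual check
library(lme4)
library(DHARMa)
lmer_ri <- lmer(y ~ x + (1 | Group), data = df)
condResids <- simulateResiduals(lmer_ri, re.form = NULL)
plot(condResids)
```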

Finally, note that DHARMa has a function createData() that allows you to create data according to the assumptions of a mixed model.
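A sketch of such a call (argument and column names from memory; check ?createData before relying on them):

```r
# createData() simulates data matching mixed-model assumptions; the
# simulated response and predictor columns are assumed here to be named
# observedResponse, Environment1, and group as in the DHARMa vignette
library(DHARMa)
library(glmmTMB)
dd <- createData(sampleSize = 500, family = gaussian(),
                 numGroups = 10, randomEffectVariance = 1)
m  <- glmmTMB(observedResponse ~ Environment1 + (1 | group), data = dd)
plot(simulateResiduals(m))
```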

Upvotes: 4

user12256545

Reputation: 3022

This is not an error in the implementation of DHARMa but a statistical issue regarding normality testing.

The results of your Kolmogorov-Smirnov test are what you would expect from it: it tests whether the data deviate from H0, namely that the data are normally distributed.

Since you simulate these values with mean 0 and a standard deviation between 5 and 15 (group.sd <- runif(ii, min = 5, max = 15)), in the worst case a group has an SD of ~15. Further, the more samples you generate with rnorm, the higher the absolute number of points that deviate from a single common normal distribution, even if their proportion stays the same, which ramps up the power of the KS test (larger N -> more test power).
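A quick sketch of this N-driven power effect (numbers purely illustrative): draw from a distribution that is only slightly non-normal, such as a t-distribution with 30 df, and compare KS p-values at small vs. large N:

```r
# t with 30 df is very close to normal: the KS test barely notices the
# difference at small N but tends to reject it at large N
set.seed(1)
ks.test(rt(100, df = 30), "pnorm")$p.value  # usually well above 0.05
ks.test(rt(1e5, df = 30), "pnorm")$p.value  # usually far below 0.05
```

The deviation per data point is identical in both calls; only the sample size, and hence the test's power, changes.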

You can read more on this at:

https://stats.stackexchange.com/questions/333892/is-the-kolmogorov-smirnov-test-too-strict-if-the-sample-size-is-large

And check out suggested alternatives discussed here:

https://stats.stackexchange.com/questions/465196/kolmogorov-smirnov-test-statistic-interpretation-with-large-samples

Upvotes: 1
