Luckylou

Reputation: 11

Linear mixed-effects models (nlme/lme4): interpretation of marginal and conditional R^2 values

I'm investigating bacterial diversity and want to know whether the diversity depends on pH. My data are structured as follows:

Now I am searching for a way to answer the question "Is there an effect of pH on the diversity across all studies?"

My idea was to use the function lme and set study as a random factor. Looking at the data, a quadratic curve seems to fit better than a straight line, so I also fitted the model with a quadratic term for pH:

library(nlme) # lme() comes from the nlme package

my_model <- lme(fixed = Bacterial_diversity ~ pH + I(pH^2),
                random = ~ pH | Study,
                data = my_data)

The output (summary) is:

> summary(my_model)
Linear mixed-effects model fit by REML
  Data: my_data
       AIC      BIC    logLik
  471.7855 497.7353 -228.8928

Random effects:
 Formula: ~pH | Paper
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev    Corr  
(Intercept) 4.4808759 (Intr)
pH          0.4783127 -0.88 
Residual    0.4154606       

Fixed effects:  Bacterial_diversity ~ pH + I(pH^2) 
                 Value Std.Error  DF   t-value p-value
(Intercept)  1.6641091 1.8078372 285  0.920497  0.3581
pH           1.1750097 0.4670426 285  2.515851  0.0124
I(pH^2)     -0.1187954 0.0363455 285 -3.268508  0.0012
 Correlation: 
        (Intr) pH    
pH      -0.895       
I(pH^2)  0.763 -0.959

Standardized Within-Group Residuals:
       Min         Q1        Med         Q3        Max 
-4.2091539 -0.4574020  0.1168270  0.6216446  2.0828655 

Number of Observations: 304
Number of Groups: 17 

I don't really know how to interpret these results. When I use tab_model(my_model) I end up with two R^2 values: the marginal R^2 (0.025) and the conditional R^2 (0.974). The given p-values are 0.0012 (for the quadratic term of pH) and 0.0124 (for pH). Are the R^2 values calculated for the whole model, and how can I interpret them? Can I say that my data are better described by a quadratic term than by a linear regression (because of the smaller p-value of the pH^2 term)?
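For reference, this is roughly the call I used to get those two values (tab_model() comes from the sjPlot package; nothing else is involved):

library(sjPlot)      # tab_model() lives here
tab_model(my_model)  # prints the fixed effects together with marginal and conditional R^2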

Is the model I used even the right one, or should I have used lme4 instead of nlme (I don't really understand the difference between the two packages)?

I just want to show the relationship between pH and bacterial diversity while accounting for the different studies.

If you have any ideas, this would help a lot. And if there are questions regarding the data, the code, or anything else, please feel free to ask.

Upvotes: 1

Views: 4219

Answers (1)

Shawn Hemelstrand

Reputation: 3238

Summary and Citations

I recommend using the performance package or the partR2 package, both of which provide the Nakagawa marginal and conditional R2 values that you mention. I also advise reading the three papers on this subject, which explain what these values are:

Fitting a Model

I provide a worked example with these values and explanations below. First, I load the required packages and fit the model. For simplicity, I drop the NA values here, but they should be handled with care in a real analysis.

#### Libraries ####
library(lmerTest)    # for model fitting (also provides the carrots data)
library(performance) # for Nakagawa conditional/marginal R2
library(partR2)      # for part R2 values
library(tidyverse)   # for tidying data

#### Remove NA Values ####
carrots <- carrots %>% 
  drop_na()

#### Fit Model ####
fit <- lmer(Preference ~ sens2 + Homesize + (1 | Consumer),
            data = carrots)

Marginal and Conditional R2

Then we can run a simple summary of the pseudo-R2 values.

#### Run Pseudo R2 and Part R2 ####
r2_nakagawa(fit)

Which gives us this:

# R2 for Mixed Models

  Conditional R2: 0.176
     Marginal R2: 0.064

Conditional R2 is the proportion of variance explained by the entire model: here, the fixed and random effects together explain about 17.6% of the variance in the outcome. The marginal R2 is the proportion of that variance attributed to the fixed effects alone, which here is fairly small: about 6.4%.
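If it helps to see where these numbers come from, below is a rough hand computation for a Gaussian random-intercept model like this one. It follows the Nakagawa & Schielzeth definitions (fixed-effect variance over total variance for the marginal R2; fixed plus random-effect variance over total variance for the conditional R2) and should approximately reproduce the r2_nakagawa() output; treat it as a sketch rather than a replacement for the function.

#### Hand Computation (sketch) ####
vc <- as.data.frame(VarCorr(fit))            # variance components from the fitted model
var_rand  <- vc$vcov[vc$grp == "Consumer"]   # random-intercept variance
var_resid <- vc$vcov[vc$grp == "Residual"]   # residual variance
var_fixed <- var(predict(fit, re.form = NA)) # variance of the fixed-effect predictions

var_fixed / (var_fixed + var_rand + var_resid)              # ~ marginal R2
(var_fixed + var_rand) / (var_fixed + var_rand + var_resid) # ~ conditional R2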

Individual Effects: Part R2

However, this doesn't tell us much about the individual effects. For that, partR2 can help (to a degree). Note that if you have missing data, or random slopes that match the fixed effects, the partR2 function will throw an error. I have set nboot to 100, but it should be closer to 1000 for a final model (I only use 100 here to save time and computation).

part <- partR2(fit,
               partvars = c("sens2", "Homesize"),
               nboot = 100)

You then get a summary of the fixed-effect sizes:

summary(part)

The output is shown below. You'll notice that it automatically provides a marginal R2, which matches the combined part R2 of sens2+Homesize:

R2 (marginal) and 95% CI for the full model: 
 R2     CI_lower CI_upper ndf
 0.0643 0.0407   0.0964   3  

----------

Part (semi-partial) R2:
 Predictor(s)   R2     CI_lower CI_upper ndf
 Model          0.0643 0.0407   0.0964   3  
 sens2          0.0498 0.0262   0.0818   2  
 Homesize       0.0145 0.0000   0.0461   2  
 sens2+Homesize 0.0643 0.0407   0.0964   1  

----------

Inclusive R2 (SC^2 * R2):
 Predictor IR2    CI_lower CI_upper
 sens2     0.0498 0.0307   0.0729  
 Homesize3 0.0145 0.0016   0.0401  

----------

Structure coefficients r(Yhat,x):
 Predictor SC      CI_lower CI_upper
 sens2      0.8804  0.6980   0.9834 
 Homesize3 -0.4749 -0.7164  -0.1814 

----------

Beta weights (standardised estimates)
 Predictor BW      CI_lower CI_upper
 sens2      0.2235  0.1754   0.2703 
 Homesize3 -0.2800 -0.4716  -0.0914 

----------

Here we can see that sens2 contributes more to the model than Homesize, based on its higher explained variance. For more information, refer to the articles I have included above.

Upvotes: 5
