Reputation: 11
I'm investigating bacterial diversity and want to know whether the diversity depends on pH. My data are structured as follows:
Now I am searching for a way to answer the question "Is there an effect of pH on diversity across all studies?"
My idea was to use the function lme
and set study as a random factor. Looking at the data, they seem to fit a quadratic curve better than a straight line, so I also calculated the model with a quadratic term for pH:
my_model <- lme(fixed  = Bacterial_diversity ~ pH + I(pH^2),
                random = ~ pH | Study,
                data   = my_data)
The output (summary) is:
> summary(my_model)
Linear mixed-effects model fit by REML
Data: my_data
AIC BIC logLik
471.7855 497.7353 -228.8928
Random effects:
Formula: ~pH | Paper
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 4.4808759 (Intr)
pH 0.4783127 -0.88
Residual 0.4154606
Fixed effects: Bacterial_diversity ~ pH + I(pH^2)
Value Std.Error DF t-value p-value
(Intercept) 1.6641091 1.8078372 285 0.920497 0.3581
pH 1.1750097 0.4670426 285 2.515851 0.0124
I(pH^2) -0.1187954 0.0363455 285 -3.268508 0.0012
Correlation:
(Intr) pH
pH -0.895
I(pH^2) 0.763 -0.959
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-4.2091539 -0.4574020 0.1168270 0.6216446 2.0828655
Number of Observations: 304
Number of Groups: 17
I don't really know how to interpret these results. When I use tab_model(my_model)
I end up with two R^2 values: the marginal R^2 (0.025) and the conditional R^2 (0.974). The p-values are 0.0012 (for the quadratic pH term) and 0.0124 (for pH).
Are the R^2 values calculated for the whole model, and how can I interpret them? Can I say that my data fit a quadratic model better than a linear one (given the smaller p-value of the pH^2 term)?
Is the model I used even the right one, or should I have used nlme instead of lme4? (I don't really understand the difference!)
I just want to show the relation between pH and bacterial diversity with respect to the different studies.
Any ideas would help a lot. If there are questions regarding the data, the code, or anything else, please feel free to ask.
Upvotes: 1
Views: 4219
Reputation: 3238
I recommend using the performance
package or the partR2
package, both of which compute the Nakagawa marginal and conditional R2 values that you mention. I also advise reading these three papers on the subject, which explain what the values are.
I provide a worked example with these values and explanations below. First, I load the required packages and fit the model. For simplicity, I drop the NA values here, but they should be handled with care in a real analysis.
#### Libraries ####
library(lmerTest)    # for model fitting (also provides the carrots data)
library(performance) # for Nakagawa conditional/marginal R2
library(partR2)      # for part R2 values
library(tidyverse)   # for tidying data

#### Remove NA Values ####
carrots <- carrots %>% 
  drop_na()

#### Fit Model ####
fit <- lmer(Preference ~ sens2 + Homesize + (1 | Consumer),
            data = carrots)
Then we can run a simple summary of pseudo R2.
#### Run Pseudo R2 and Part R2 ####
r2_nakagawa(fit)
Which gives us this:
# R2 for Mixed Models
Conditional R2: 0.176
Marginal R2: 0.064
The conditional R2 is the amount of explained variance for the entire model: here, the fixed and random effects together explain about 17.6% of the variance in the outcome. The marginal R2 tells you how much of that variance is attributable to the fixed effects alone. Here it is fairly small: about 6.4%.
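For intuition, here is a minimal sketch of how these two values are assembled for a Gaussian model like this one, following the Nakagawa &amp; Schielzeth decomposition (r2_nakagawa() handles the general case, including non-Gaussian families; the variable names below are my own):

```r
# Variance explained by the fixed effects: variance of the
# population-level predictions (random effects set to zero)
var_fixed <- var(predict(fit, re.form = NA))

# Summed random-effect variances, plus the residual variance
vc        <- as.data.frame(VarCorr(fit))
var_rand  <- sum(vc$vcov[vc$grp != "Residual"])
var_resid <- sigma(fit)^2

total       <- var_fixed + var_rand + var_resid
marginal    <- var_fixed / total               # fixed effects only
conditional <- (var_fixed + var_rand) / total  # fixed + random effects
```

These should agree (up to rounding) with the r2_nakagawa(fit) output above.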
However, this doesn't say much about the individual effects. Here partR2
can help (to a degree). Note that if you have missing data, or random slopes that match the fixed effects, the partR2
function will throw an error. I have set nboot to 100, but it should be closer to 1000 for a final model (I use 100 here only to save time and computation).
part <- partR2(fit,
               partvars = c("sens2", "Homesize"),
               nboot = 100)
You get a summary of the fixed effect sizes with:
summary(part)
The output is shown below. You'll notice that it automatically provides a marginal R2, which matches the total part R2 of sens2+Homesize
:
R2 (marginal) and 95% CI for the full model:
R2 CI_lower CI_upper ndf
0.0643 0.0407 0.0964 3
----------
Part (semi-partial) R2:
Predictor(s) R2 CI_lower CI_upper ndf
Model 0.0643 0.0407 0.0964 3
sens2 0.0498 0.0262 0.0818 2
Homesize 0.0145 0.0000 0.0461 2
sens2+Homesize 0.0643 0.0407 0.0964 1
----------
Inclusive R2 (SC^2 * R2):
Predictor IR2 CI_lower CI_upper
sens2 0.0498 0.0307 0.0729
Homesize3 0.0145 0.0016 0.0401
----------
Structure coefficients r(Yhat,x):
Predictor SC CI_lower CI_upper
sens2 0.8804 0.6980 0.9834
Homesize3 -0.4749 -0.7164 -0.1814
----------
Beta weights (standardised estimates)
Predictor BW CI_lower CI_upper
sens2 0.2235 0.1754 0.2703
Homesize3 -0.2800 -0.4716 -0.0914
----------
Here we can see that sens2
contributes more to the model than Homesize, based on its higher explained variance. For more information, refer to the articles I have included above.
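As for the asker's linear-versus-quadratic question, one common approach is a likelihood-ratio comparison of the two nested models. A sketch using the same carrots example (with sens2 standing in for pH; model names are my own):

```r
# Models compared by likelihood ratio must be fit with ML, not REML
# (anova() on merMod objects refits with ML automatically, but being
# explicit makes the intent clear)
fit_lin  <- lmer(Preference ~ sens2 + (1 | Consumer),
                 data = carrots, REML = FALSE)
fit_quad <- lmer(Preference ~ sens2 + I(sens2^2) + (1 | Consumer),
                 data = carrots, REML = FALSE)

anova(fit_lin, fit_quad)  # LRT plus AIC/BIC for both models
```

If the quadratic model has a clearly lower AIC and a significant likelihood-ratio test, that is stronger evidence for the curved fit than the individual coefficient p-values alone.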
Upvotes: 5