JoachimMun
JoachimMun

Reputation: 71

Meta-regression with metafor: Where has the variance gone?

I am wondering about the interpretation of the results of my meta analysis.

The set-up is very simlple: I have a dataset y that includes a variable TVAL (taken from a set of primary studies). TVAL is standardized so that its standard errors (the sampling standard errors from the primary studies) are 1 for all observations. SHORTREF is a factor representing the different primary studies.

I now perform a simple meta regression on a constant, using the metafor package:

> m<-rma.mv(yi=TVAL, V=1, random= ~ 1|SHORTREF, intercept=TRUE, method="REML", data=y)

> summary(m)

Multivariate Meta-Analysis Model (k = 933; method: REML)

    logLik    Deviance         AIC         BIC        AICc  
-3056.0316   6112.0633   6116.0633   6125.7379   6116.0762  

Variance Components: 

            estim    sqrt  nlvls  fixed    factor
sigma^2    6.6821  2.5850     84     no  SHORTREF

Test for Heterogeneity: 
Q(df = 932) = 8115.1664, p-val < .0001

Model Results:

estimate       se     zval     pval    ci.lb    ci.ub          
 -0.5544   0.2861  -1.9375   0.0527  -1.1151   0.0064        . 

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual heterogeneity (sigma^2) is 6.6821. Together with the sampling variance from the primary studies (which is 1 due to the standardization) I get a total variance of 6.6821 + 1 = 7.6821. However, the total variance of my regressand TVAL is var(y$TVAL)=8.70726. That means, I am "missing" a part of my total variance which is neither sampling variance nor residual heterogeneity. This effect does not show up if I leave out the 1|SHORTREF factor:

> m<-rma(yi=TVAL, sei=1, intercept=TRUE, method="REML", data=y)
> summary(m)

Random-Effects Model (k = 933; tau^2 estimator: REML)

    logLik    deviance         AIC         BIC        AICc  
-2330.9480   4661.8959   4665.8959   4675.5706   4665.9088  

tau^2 (estimated amount of total heterogeneity): 7.7073 (SE = 0.4034)
tau (square root of estimated tau^2 value):      2.7762
I^2 (total heterogeneity / total variability):   88.52%
H^2 (total variability / sampling variability):  8.71

Test for Heterogeneity: 
Q(df = 932) = 8115.1664, p-val < .0001

Model Results:

estimate       se     zval     pval    ci.lb    ci.ub          
 -0.5964   0.0966  -6.1731   <.0001  -0.7857  -0.4070      *** 

---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Here, the residual heterogeneity tau^2 is the total variance var(y$TVAL)=8.70726 minus my sampling variance of 1.

Perhaps, it's really simple and I'm just not getting it, but any idea why in the first model (the one with the 1|SHORTREF factor) not the full variance of TVAL is split into the assumed two error components (sampling and residual heterogeneity)?

Thanks! Your help is really appreciated.

Upvotes: 1

Views: 1010

Answers (1)

Wolfgang
Wolfgang

Reputation: 3395

There are three sources of variability here that you need to consider: Heterogeneity between studies, heterogeneity within studies, and sampling error. In your first model, you are neglecting the within-study heterogeneity. You should use something like this:

id <- 1:nrow(y)
m <- rma.mv(yi=TVAL, V=1, random= ~ 1|SHORTREF/id, data=y)

Note that the sum of the two variance components you get from this model (plus 1) will still not exactly add up to var(TVAL) because the decomposition isn't quite so simple. But you should then get something that closely matches the this variance. Here is an example with simulated data:

library(metafor)

k <- 100 ### total number of estimates
m <- 20  ### number of studies

vi <- rep(1, k)
id <- 1:k
study <- sort(sample(1:m, k, replace=TRUE))
yi <- rep(rnorm(length(unique(study)), 0, 2), times=table(study)) + rnorm(k, 0, 4)

var(yi)

res <- rma(yi, vi)
res$tau2 + 1

res <- rma.mv(yi, vi, random = ~ 1 | study/id)
sum(res$sigma2) + 1

Upvotes: 1

Related Questions