Reputation: 49
I have a dataset that is in a long format with 200 variables, 94 subjects, and each subject has anywhere from 1 to 3 measurements for each variable.
Eg:
ID measurement var1 var2 . . .
1 1 2 6
1 2 3 8
1 3 6 12
2 1 3 9
2 2 4 4
2 3 5 3
3 1 1 11
3 2 1 4
. . . .
. . . .
. . . .
However, some variables have missing values for one of three measurements. It was suggested to me that before imputing missing values with the mean for the subject, I should use a repeated measures ANOVA or mixed model in order to confirm the repeatability of measurements.
The first thing I found to calculate the ICC was the ICC() function from the psych package. However, from what I can tell this requires that the data have one row per subject and one column per measurement, which would be further complicated by the fact that I have 200 variables I need to calculate the ICC for individually. I did go ahead and calculate the ICC for a single variable, and obtained this output:
Intraclass correlation coefficients
type ICC F df1 df2 p lower bound upper bound
Single_raters_absolute ICC1 0.38 2.8 93 188 0.00000000067 0.27 0.49
Single_random_raters ICC2 0.38 2.8 93 186 0.00000000068 0.27 0.49
Single_fixed_raters ICC3 0.38 2.8 93 186 0.00000000068 0.27 0.49
Average_raters_absolute ICC1k 0.65 2.8 93 188 0.00000000067 0.53 0.74
Average_random_raters ICC2k 0.65 2.8 93 186 0.00000000068 0.53 0.74
Average_fixed_raters ICC3k 0.65 2.8 93 186 0.00000000068 0.53 0.74
Number of subjects = 94 Number of Judges = 3
Next, I tried to calculate the ICC using a mixed model. Using this code:
m1 <- lme(var1 ~ measurement, random=~1|ID, data=mydata, na.action=na.omit)
summary(m1)
The output looks like this:
Linear mixed-effects model fit by REML
Data: mydata
AIC BIC logLik
-1917.113 -1902.948 962.5564
Random effects:
Formula: ~1 | ORIGINAL_ID
(Intercept) Residual
StdDev: 0.003568426 0.004550419
Fixed effects: var1 ~ measurement
Value Std.Error DF t-value p-value
(Intercept) 0.003998953 0.0008388997 162 4.766902 0.0000
measurement 0.000473053 0.0003593452 162 1.316429 0.1899
Correlation:
(Intr)
measurement -0.83
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.35050264 -0.30417725 -0.03383329 0.25106803 12.15267443
Number of Observations: 257
Number of Groups: 94
Is this the correct model to use to assess ICC? It is not clear to me what the correlation (Intr) is measuring, and it is different from the ICC obtained using ICC().
This is my first time calculating and using intraclass correlation, so any help is appreciated!
Upvotes: 2
Views: 5365
Reputation: 740
Using a mock dataset...
set.seed(42)
n <- 6
dat <- data.frame(id=rep(1:n, 2),
group= as.factor(rep(LETTERS[1:2], n/2)),
V1 = rnorm(n),
V2 = runif(n*2, min=0, max=100),
V3 = runif(n*2, min=0, max=100),
V4 = runif(n*2, min=0, max=100),
V5 = runif(n*2, min=0, max=100))
Loading some libraries...
library(lme4)
library(purrr)
library(tidyr)
# Add list of variable names to the vector below...
var_list <- c("V1","V2","V3","V4","V5")
map_dfr() is from the purrr library. I use lme4::VarCorr() to get the variances at each level.
map_dfr(var_list,
function(x){
formula_mlm = as.formula(paste0(x,"~ group + (1|id)"));
model_fit = lmer(formula_mlm,data=dat);
re_variances = VarCorr(model_fit,comp="Variance") %>%
data.frame() %>%
dplyr::mutate(variable = x);
return(re_variances)
}) %>%
dplyr::select(variable,grp,vcov) %>%
pivot_wider(names_from="grp",values_from="vcov") %>%
dplyr::mutate(icc = id/(id+Residual))
Upvotes: 0