Reputation: 73
Given the following dataset:
csf age sex tiv group
0,30 7,92 1 1,66 1
0,26 33,75 0 1,27 3
0,18 7,83 0 1,43 2
0,20 9,42 0 1,70 1
0,29 22,33 1 1,68 2
0,40 20,75 1 1,56 1
0,26 13,25 0 1,68 1
0,28 6,67 0 1,66 1
0,22 10,58 0 1,38 1
0,22 13,08 0 1,41 2
0,33 36,42 1 1,68 3
0,29 35,00 1 1,34 3
0,11 7,25 1 1,20 2
0,13 10,00 0 1,12 3
0,32 34,58 1 1,33 3
0,68 8,25 1 1,90 1
0,25 11,08 1 1,92 2
0,33 10,92 0 1,24 1
0,20 9,33 1 1,58 1
0,25 51,67 0 1,15 3
0,16 27,67 0 1,19 3
0,19 33,25 0 1,29 3
0,16 7,92 1 1,67 1
0,17 13,42 0 1,34 3
0,45 48 1 1,85 1
0,34 14,67 1 1,80 1
0,23 35,33 0 1,31 3
0,18 15,50 1 1,59 1
0,11 12,08 0 1,34 2
0,21 9,92 0 1,43 1
0,19 8,83 0 1,59 1
0,21 6,83 1 1,78 1
0,13 10 0 1,28 1
0,38 38,42 1 1,63 3
0,27 13,83 0 1,63 1
0,28 15,33 0 1,43 2
0,31 38 1 1,70 1
0,19 13,08 0 1,56 1
0,13 26,25 0 1,07 3
0,14 63,08 1 1,34 3
0,19 10,25 1 1,27 3
0,38 37,25 1 1,63 3
0,28 37,33 0 1,47 3
0,34 20,25 1 1,41 2
0,36 40,33 1 1,44 3
0,26 42,83 0 1,43 2
0,29 46,08 1 1,74 2
0,19 10,25 0 1,56 1
0,20 12,08 1 1,76 1
0,29 30,58 1 1,39 3
0,23 44,67 1 1,45 3
I want to know whether CSF is different between groups. But I know that CSF is highly affected by age, sex, and tiv. So, I would like to plot the differences between groups beyond the influence of age, sex, and tiv. To that end, I need to adjust CSF for those three covariates. My question is: how can I obtain, for each individual, his/her adjusted CSF value?
I fitted the following linear model:
model1 <- lm(csf ~ age + sex + tiv, data = mri22)
And used the sum of (residuals+intercept) in order to obtain the csf value free from the effects of age, sex, and tiv:
csf_adj <- resid(model1) + coef(model1)[1]
However, I get many negative values, which make no sense given that CSF cannot be negative. So my question is: how can I obtain sensible CSF values adjusted for all three covariates?
Upvotes: 2
Views: 6500
Reputation: 25
Although it's a bit late:
Your model says that csf depends linearly on age, sex and tiv. This explains some percentage of the variance in the data. The remaining variance is in the residuals.
The model is csf = a*age + b*sex + c*tiv + d. If r is the residual, then the csf predicted by the model is a*age + b*sex + c*tiv + d, while the observed csf (the data you have) is a*age + b*sex + c*tiv + d + r.
Adding only the intercept d to the residual, as in the question, corresponds to an individual with age = 0, sex = 0 and tiv = 0, which lies outside the data and can give negative values. If you want to control for age, sex and tiv, replace each individual's values with the corresponding sample means instead. That is, adjusted csf = a*(mean of age) + b*(mean of sex) + c*(mean of tiv) + d + r.
This adjusted csf then varies only because of things other than age, sex or tiv.
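A minimal sketch of that adjustment in R, assuming the data frame is called mri22 as in the question. To keep the example self-contained it reads only the first 15 rows of the posted data (decimal commas are handled with dec = ","). The helper name avg is made up for illustration.

```r
# First 15 rows of the question's data, pasted as posted (decimal commas).
raw <- "csf age sex tiv group
0,30 7,92 1 1,66 1
0,26 33,75 0 1,27 3
0,18 7,83 0 1,43 2
0,20 9,42 0 1,70 1
0,29 22,33 1 1,68 2
0,40 20,75 1 1,56 1
0,26 13,25 0 1,68 1
0,28 6,67 0 1,66 1
0,22 10,58 0 1,38 1
0,22 13,08 0 1,41 2
0,33 36,42 1 1,68 3
0,29 35,00 1 1,34 3
0,11 7,25 1 1,20 2
0,13 10,00 0 1,12 3
0,32 34,58 1 1,33 3
"
mri22 <- read.table(text = raw, header = TRUE, dec = ",")

# Same model as in the question.
model1 <- lm(csf ~ age + sex + tiv, data = mri22)

# A hypothetical "average individual": every covariate set to its sample mean.
avg <- data.frame(age = mean(mri22$age),
                  sex = mean(mri22$sex),
                  tiv = mean(mri22$tiv))

# Adjusted csf = prediction at the covariate means + each individual's residual.
csf_adj <- resid(model1) + as.numeric(predict(model1, newdata = avg))

head(csf_adj)
```

Because the model has an intercept, the residuals average to zero and the prediction at the covariate means equals the mean of csf, so the adjusted values are centred on the observed mean csf rather than on the intercept.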
Upvotes: 0
Reputation: 5152
As @Gopala said, there is apparently no effect of group on the intercept. There is also no effect of group on the slopes (the coefficients of age, sex and tiv). You can see this in the plots and statistical tests below.
mri22$group <- as.factor(mri22$group)
plot(mri22)
plot(csf~group,data=mri22,col=mri22$group)
plot(csf~age,data=mri22,col=mri22$group)
plot(csf~sex,data=mri22,col=mri22$group)
plot(csf~tiv,data=mri22,col=mri22$group)
model1 <- lm(csf ~ age + sex + tiv,data=mri22)
summary(model1)
model2 <- lm(csf ~ 0+age + sex + tiv+group,data=mri22)
summary(model2)
model3 <- lm(csf ~ 0+age*I(group) + sex + tiv,data=mri22)
summary(model3)
model4 <- lm(csf ~ 0+age*I(group) + sex*I(group) + tiv*I(group),data=mri22)
summary(model4)
Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
age             0.0025507  0.0020500   1.244   0.2208
I(group)1      -0.1902470  0.2174566  -0.875   0.3870
I(group)2      -0.0076027  0.2224419  -0.034   0.9729
I(group)3      -0.2303957  0.1993927  -1.155   0.2549
sex             0.0208069  0.0480609   0.433   0.6675
tiv             0.2552315  0.1428288   1.787   0.0817 .
age:I(group)2  -0.0002252  0.0030392  -0.074   0.9413
age:I(group)3  -0.0021075  0.0026656  -0.791   0.4339
I(group)2:sex  -0.0048219  0.0790885  -0.061   0.9517
I(group)3:sex  -0.0014738  0.0711362  -0.021   0.9836
I(group)2:tiv  -0.1307945  0.2153850  -0.607   0.5472
I(group)3:tiv   0.0796898  0.2143078   0.372   0.7120
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
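The t-tests in a summary look at one coefficient at a time. One possible complement, sketched here as an assumption and not something done in this answer, is a single F-test that compares the model without group against the model with group using anova(). The example uses only the first 15 rows of the posted data so that it runs on its own, so its numbers will not match the output above. The names base and with_group are made up for illustration.

```r
# First 15 rows of the question's data, pasted as posted (decimal commas).
raw <- "csf age sex tiv group
0,30 7,92 1 1,66 1
0,26 33,75 0 1,27 3
0,18 7,83 0 1,43 2
0,20 9,42 0 1,70 1
0,29 22,33 1 1,68 2
0,40 20,75 1 1,56 1
0,26 13,25 0 1,68 1
0,28 6,67 0 1,66 1
0,22 10,58 0 1,38 1
0,22 13,08 0 1,41 2
0,33 36,42 1 1,68 3
0,29 35,00 1 1,34 3
0,11 7,25 1 1,20 2
0,13 10,00 0 1,12 3
0,32 34,58 1 1,33 3
"
mri22 <- read.table(text = raw, header = TRUE, dec = ",")
mri22$group <- factor(mri22$group)

# Nested models: covariates only, then covariates plus group.
base       <- lm(csf ~ age + sex + tiv, data = mri22)
with_group <- lm(csf ~ age + sex + tiv + group, data = mri22)

# F-test for whether adding group (2 extra parameters) improves the fit.
cmp <- anova(base, with_group)
print(cmp)
```

A large p-value in the second row of the table would point the same way as the individual t-tests: no detectable group effect once age, sex and tiv are in the model.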
Upvotes: 1
Reputation: 10473
You can run a regression like this and it will tell you whether group is significant. Here, it shows that it is not:
df$group <- as.factor(df$group)
fit <- lm(csf ~ age + sex + tiv + group, data = df)
summary(fit)
Call:
lm(formula = csf ~ age + sex + tiv + group, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-0.12429 -0.04760 -0.00306  0.01967  0.34004

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.155845   0.147860  -1.054   0.3000
age          0.002895   0.001539   1.881   0.0694 .
sex          0.019926   0.036502   0.546   0.5891
tiv          0.237891   0.097655   2.436   0.0208 *
group2      -0.037555   0.040104  -0.936   0.3563
group3      -0.013844   0.051717  -0.268   0.7907
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0874 on 31 degrees of freedom
Multiple R-squared: 0.4342, Adjusted R-squared: 0.3429
F-statistic: 4.757 on 5 and 31 DF, p-value: 0.00243
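To get per-individual values that can be plotted by group, one option (a sketch under assumptions, not part of the regression above) is to subtract from each observed csf the fitted contribution of age, sex and tiv, measured relative to their sample means. This keeps any group difference in the adjusted values. The example uses only the first 15 rows of the posted data so that it runs on its own. The names covs and centered are made up for illustration.

```r
# First 15 rows of the question's data, pasted as posted (decimal commas).
raw <- "csf age sex tiv group
0,30 7,92 1 1,66 1
0,26 33,75 0 1,27 3
0,18 7,83 0 1,43 2
0,20 9,42 0 1,70 1
0,29 22,33 1 1,68 2
0,40 20,75 1 1,56 1
0,26 13,25 0 1,68 1
0,28 6,67 0 1,66 1
0,22 10,58 0 1,38 1
0,22 13,08 0 1,41 2
0,33 36,42 1 1,68 3
0,29 35,00 1 1,34 3
0,11 7,25 1 1,20 2
0,13 10,00 0 1,12 3
0,32 34,58 1 1,33 3
"
df <- read.table(text = raw, header = TRUE, dec = ",")
df$group <- as.factor(df$group)

fit <- lm(csf ~ age + sex + tiv + group, data = df)
b <- coef(fit)

# Covariates centred on their sample means (each column then sums to zero).
covs <- c("age", "sex", "tiv")
centered <- scale(as.matrix(df[, covs]), center = TRUE, scale = FALSE)

# Remove the fitted covariate effects but leave the group effect and residual.
df$csf_adj <- df$csf - as.numeric(centered %*% b[covs])

# Covariate-adjusted mean csf per group.
tapply(df$csf_adj, df$group, mean)
```

From here, boxplot(csf_adj ~ group, data = df) would show the groups side by side after adjustment. Since the covariates are centred, the adjusted values keep the same overall mean as the observed csf.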
Upvotes: 0