Borja

Reputation: 73

How to obtain adjusted dependent variables

Given the following dataset:

csf     age    sex  tiv   group
0,30    7,92    1   1,66    1
0,26    33,75   0   1,27    3
0,18    7,83    0   1,43    2
0,20    9,42    0   1,70    1
0,29    22,33   1   1,68    2
0,40    20,75   1   1,56    1
0,26    13,25   0   1,68    1
0,28    6,67    0   1,66    1
0,22    10,58   0   1,38    1
0,22    13,08   0   1,41    2
0,33    36,42   1   1,68    3
0,29    35,00   1   1,34    3
0,11    7,25    1   1,20    2
0,13    10,00   0   1,12    3
0,32    34,58   1   1,33    3
0,68    8,25    1   1,90    1
0,25    11,08   1   1,92    2
0,33    10,92   0   1,24    1
0,20    9,33    1   1,58    1
0,25    51,67   0   1,15    3
0,16    27,67   0   1,19    3
0,19    33,25   0   1,29    3
0,16    7,92    1   1,67    1
0,17    13,42   0   1,34    3
0,45    48      1   1,85    1
0,34    14,67   1   1,80    1
0,23    35,33   0   1,31    3
0,18    15,50   1   1,59    1
0,11    12,08   0   1,34    2
0,21    9,92    0   1,43    1
0,19    8,83    0   1,59    1
0,21    6,83    1   1,78    1
0,13    10      0   1,28    1
0,38    38,42   1   1,63    3
0,27    13,83   0   1,63    1
0,28    15,33   0   1,43    2
0,31    38      1   1,70    1
0,19    13,08   0   1,56    1
0,13    26,25   0   1,07    3
0,14    63,08   1   1,34    3
0,19    10,25   1   1,27    3
0,38    37,25   1   1,63    3
0,28    37,33   0   1,47    3
0,34    20,25   1   1,41    2
0,36    40,33   1   1,44    3
0,26    42,83   0   1,43    2
0,29    46,08   1   1,74    2
0,19    10,25   0   1,56    1
0,20    12,08   1   1,76    1
0,29    30,58   1   1,39    3
0,23    44,67   1   1,45    3

I want to know whether CSF is different between groups. But I know that CSF is highly affected by age, sex, and tiv. So, I would like to plot the differences between groups beyond the influence of age, sex, and tiv. To that end, I need to adjust CSF for those three covariates. My question is: how can I obtain, for each individual, his/her adjusted CSF value?

I did the following linear model:

model1 <- lm(csf ~ age + sex + tiv, data = mri22)

And used the sum of (residuals+intercept) in order to obtain the csf value free from the effects of age, sex, and tiv:

csf_adj <- resid(model1) + coef(model1)[1]

However, I get many negative values, which make no sense because CSF cannot be negative. So my question is: how can I obtain sensible CSF values adjusted for all three covariates?

Upvotes: 2

Views: 6500

Answers (3)

Parthiban Bala

Reputation: 25

Although this answer comes late:

Your model assumes that csf depends linearly on age, sex, and tiv. The model explains part of the variance in the data; the remaining variance is left in the residuals.

The model is csf = a*age + b*sex + c*tiv + d. If r is the residual, then the predicted csf is a*age + b*sex + c*tiv + d, while the observed csf (the data you have) is a*age + b*sex + c*tiv + d + r.

To control for age, sex, and tiv, replace each individual's covariate values with the corresponding sample means: adjusted csf = a*mean(age) + b*mean(sex) + c*mean(tiv) + d + r.

The variation left in this adjusted csf is the part the model does not attribute to age, sex, or tiv.
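A minimal sketch of this adjustment in R, assuming the question's data sit in a data frame called mri22 with decimal points instead of decimal commas. Only the first 15 rows are reproduced to keep the example short, and the name avg is purely illustrative.

```r
# First 15 rows of the question's data (decimal commas converted to points)
mri22 <- data.frame(
  csf   = c(0.30, 0.26, 0.18, 0.20, 0.29, 0.40, 0.26, 0.28,
            0.22, 0.22, 0.33, 0.29, 0.11, 0.13, 0.32),
  age   = c(7.92, 33.75, 7.83, 9.42, 22.33, 20.75, 13.25, 6.67,
            10.58, 13.08, 36.42, 35.00, 7.25, 10.00, 34.58),
  sex   = c(1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1),
  tiv   = c(1.66, 1.27, 1.43, 1.70, 1.68, 1.56, 1.68, 1.66,
            1.38, 1.41, 1.68, 1.34, 1.20, 1.12, 1.33),
  group = c(1, 3, 2, 1, 2, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3)
)

# Same model as in the question
model1 <- lm(csf ~ age + sex + tiv, data = mri22)

# Covariate values of a hypothetical "average" subject
avg <- data.frame(
  age = mean(mri22$age),
  sex = mean(mri22$sex),
  tiv = mean(mri22$tiv)
)

# Prediction at the covariate means plus each subject's own residual
csf_adj <- unname(predict(model1, newdata = avg)) + resid(model1)
```

Because the model includes an intercept, the prediction at the covariate means equals the sample mean of csf, so these values are the residuals re-centred on that mean rather than on the intercept. The intercept on its own is the predicted csf at age 0, sex 0, and tiv 0, which is presumably why adding it to the residuals produced negative values in the question. The result can then be plotted by group, for example with boxplot(csf_adj ~ mri22$group).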

Upvotes: 0

Robert

Reputation: 5152

As @Gopala said, group does not appear to affect the intercept. It also does not appear to affect the slopes (the coefficients of age, sex, and tiv). You can see this in the plots and the statistical tests below.

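# Treat group as a categorical variable
mri22$group <- as.factor(mri22$group)

# Pairwise plots of all variables, then csf against group
plot(mri22)
plot(csf ~ group, data = mri22, col = mri22$group)

# csf against each covariate, coloured by group
plot(csf ~ age, data = mri22, col = mri22$group)
plot(csf ~ sex, data = mri22, col = mri22$group)
plot(csf ~ tiv, data = mri22, col = mri22$group)

# Covariates only
model1 <- lm(csf ~ age + sex + tiv, data = mri22)
summary(model1)

# One intercept per group (the global intercept is dropped with 0 +)
model2 <- lm(csf ~ 0 + age + sex + tiv + group, data = mri22)
summary(model2)

# Group-specific intercepts and age slopes
model3 <- lm(csf ~ 0 + age * I(group) + sex + tiv, data = mri22)
summary(model3)

# Group-specific intercepts and slopes for all three covariates
model4 <- lm(csf ~ 0 + age * I(group) + sex * I(group) + tiv * I(group), data = mri22)
summary(model4)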


Coefficients:
                Estimate Std. Error t value Pr(>|t|)  
age            0.0025507  0.0020500   1.244   0.2208  
I(group)1     -0.1902470  0.2174566  -0.875   0.3870  
I(group)2     -0.0076027  0.2224419  -0.034   0.9729  
I(group)3     -0.2303957  0.1993927  -1.155   0.2549  
sex            0.0208069  0.0480609   0.433   0.6675  
tiv            0.2552315  0.1428288   1.787   0.0817 .
age:I(group)2 -0.0002252  0.0030392  -0.074   0.9413  
age:I(group)3 -0.0021075  0.0026656  -0.791   0.4339  
I(group)2:sex -0.0048219  0.0790885  -0.061   0.9517  
I(group)3:sex -0.0014738  0.0711362  -0.021   0.9836  
I(group)2:tiv -0.1307945  0.2153850  -0.607   0.5472  
I(group)3:tiv  0.0796898  0.2143078   0.372   0.7120  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Upvotes: 1

Gopala

Reputation: 10473

You can run a regression like this and it will tell you whether group is significant. Here, it shows that it is not:

df$group <- as.factor(df$group)
fit <- lm(csf ~ age + sex + tiv + group, data = df)
summary(fit)

Call:
lm(formula = csf ~ age + sex + tiv + group, data = df)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.12429 -0.04760 -0.00306  0.01967  0.34004 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.155845   0.147860  -1.054   0.3000  
age          0.002895   0.001539   1.881   0.0694 .
sex          0.019926   0.036502   0.546   0.5891  
tiv          0.237891   0.097655   2.436   0.0208 *
group2      -0.037555   0.040104  -0.936   0.3563  
group3      -0.013844   0.051717  -0.268   0.7907  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0874 on 31 degrees of freedom
Multiple R-squared:  0.4342,    Adjusted R-squared:  0.3429 
F-statistic: 4.757 on 5 and 31 DF,  p-value: 0.00243
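The summary above tests group2 and group3 separately against group 1. For a single test of group as a whole, one option is to compare the models with and without group using anova(). This is a sketch under stated assumptions: only the first 15 rows of the question's data are used, decimal commas are converted to points, the data frame is named df as in this answer, and fit0 and fit1 are illustrative names.

```r
# First 15 rows of the question's data (decimal commas converted to points)
df <- data.frame(
  csf   = c(0.30, 0.26, 0.18, 0.20, 0.29, 0.40, 0.26, 0.28,
            0.22, 0.22, 0.33, 0.29, 0.11, 0.13, 0.32),
  age   = c(7.92, 33.75, 7.83, 9.42, 22.33, 20.75, 13.25, 6.67,
            10.58, 13.08, 36.42, 35.00, 7.25, 10.00, 34.58),
  sex   = c(1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 1),
  tiv   = c(1.66, 1.27, 1.43, 1.70, 1.68, 1.56, 1.68, 1.66,
            1.38, 1.41, 1.68, 1.34, 1.20, 1.12, 1.33),
  group = c(1, 3, 2, 1, 2, 1, 1, 1, 1, 2, 3, 3, 2, 3, 3)
)
df$group <- as.factor(df$group)

# Nested models: covariates only, then covariates plus group
fit0 <- lm(csf ~ age + sex + tiv, data = df)
fit1 <- lm(csf ~ age + sex + tiv + group, data = df)

# F-test for whether adding group improves the fit (2 degrees of freedom)
cmp <- anova(fit0, fit1)
print(cmp)
```

The second row of the printed table carries the F statistic and p-value for group after accounting for age, sex, and tiv. With only 15 rows the result is illustrative; the full dataset should be used for any real conclusion.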

Upvotes: 0
