I want to run a linear regression on my multiply imputed data. I imputed the dataset using mice. This is the code I use to fit the regression on the whole imputed set:
mod1 <-with(imp, lm(outc ~ age + sex))
pool_mod1 <- pool(mod1)
summary(pool_mod1)
This works fine. Now I want to run the same regression separately in two BMI subgroups: people with a BMI below 30 and people with a BMI of 30 or above. I tried the following:
mod2 <-with(imp, lm(outc ~ age + sex), subset=(bmi<30))
pool_mod2 <- pool(mod2)
summary(pool_mod2)
mod3 <-with(imp, lm(outc ~ age + sex), subset=(bmi>=30))
pool_mod3 <- pool(mod3)
summary(pool_mod3)
I do not get an error, but all three analyses give me exactly the same results. I thought this might simply reflect the data, but the same thing happens when I subset on other variables (for example, blood pressure < 150).
So my question is: how can I run a subset analysis in R when the data are imputed using mice?
(BMI is imputed as well; I do not know whether that is a problem.)
Upvotes: 2
Views: 2688
Reputation: 2091
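You should place subset within lm(), not outside of it. When subset is passed to with() instead, it is silently ignored and the model is fit to all rows, which is why your three analyses agree.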
with(imp, lm(outc ~ age + sex, subset=(bmi<30)))
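Applied to both groups, a minimal sketch might look like the following. It uses the nhanes example data that ships with mice rather than your data, so the variables chl and age, the cut-off of 27, m = 5 and the seed are all stand-ins chosen for illustration. Because bmi is itself imputed, the rows selected by subset can differ from one completed dataset to the next.

```r
library(mice)

# nhanes is a small example dataset bundled with mice (age, bmi, hyp, chl).
# m, seed and the cut-off below are illustrative choices, not recommendations.
imp <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# subset sits inside lm(), so it is evaluated in each completed dataset.
fit_low  <- with(imp, lm(chl ~ age, subset = (bmi < 27)))
fit_high <- with(imp, lm(chl ~ age, subset = (bmi >= 27)))

# Pool each subgroup analysis separately.
pool_low  <- pool(fit_low)
pool_high <- pool(fit_high)
summary(pool_low)
summary(pool_high)

# Sanity check: per imputation, the two groups together cover every row.
n_low  <- sapply(fit_low$analyses, nobs)
n_high <- sapply(fit_high$analyses, nobs)
n_low + n_high
```

If the per-imputation group sizes in n_low and n_high both equal the full sample size, the subset is not being applied.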
A reproducible example using mtcars:
with(mtcars, lm(mpg ~ disp + hp)) # Full data
with(mtcars, lm(mpg ~ disp + hp), subset=(cyl < 6)) # subset is ignored, so both calls produce the same result
Coefficients:
(Intercept)         disp           hp
   30.73590     -0.03035     -0.02484
with(mtcars, lm(mpg ~ disp + hp, subset=(cyl < 6))) # Calculates on the subset
Coefficients:
(Intercept)         disp           hp
   43.04006     -0.11954     -0.04609
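A quick way to confirm which rows each call actually used is to compare the coefficients and the number of observations. This is only a sketch on the same mtcars example; the object names fit_all, fit_outside and fit_inside are made up for illustration.

```r
# Fit the same model three ways on the built-in mtcars data.
fit_all     <- with(mtcars, lm(mpg ~ disp + hp))
fit_outside <- with(mtcars, lm(mpg ~ disp + hp), subset = (cyl < 6))
fit_inside  <- with(mtcars, lm(mpg ~ disp + hp, subset = (cyl < 6)))

# subset passed to with() is swallowed by its `...` and never used,
# so the first two fits are the same model.
identical(coef(fit_all), coef(fit_outside))  # TRUE

# Number of rows each model was fit on.
nobs(fit_all)      # 32
nobs(fit_outside)  # 32
nobs(fit_inside)   # 11
```

The same nobs() check works on each element of the analyses list returned by with() on a mids object.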
Upvotes: 2