Medical physicist

Reputation: 2594

Regression in R with grouped variables

The dependent variable Value of the data frame DF is predicted using the independent variables Mean, X, Y in the following way:

DF <- DF %>% 
    group_by(Country, Sex) %>%
    do({ 
        mod = lm(Value ~ Mean + X + Y, data = .) 
        A <- predict(mod, .)
        data.frame(., A)
    })

Data are grouped by Country and Sex. So, the fitting formula can be expressed as:

Value(Country, Sex) = a0(Country, Sex) + a1(Country, Sex) Mean + a2(Country, Sex) X + a3(Country, Sex) Y

However, I want to use this formula:

Value(Country, Sex) = a0(Country, Sex) + a1(Country, Sex) Mean + a2(Country) X + a3(Country) Y

Here a2 and a3 do not depend on Sex. How can I do this?

Upvotes: 1

Views: 204

Answers (1)

Jan van der Laan

Reputation: 8105

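I don't think you can when grouping by Country and Sex, because each group is then fitted separately and no coefficient can be shared between groups. You could instead group by Country only and add interactions with Sex: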

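DF <- DF %>% 
    group_by(Country) %>%
    do({ 
        # Mean*Sex expands to Mean + Sex + Mean:Sex, so the intercept and the
        # Mean slope vary by Sex, while X and Y get one slope per Country
        mod <- lm(Value ~ Mean*Sex + X + Y, data = .) 
        A <- predict(mod, .)
        data.frame(., A)
    })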
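
To see how the fitted coefficients of such a per-Country model map back onto a0, a1, a2 and a3 from the question, here is a minimal sketch for a single country. The data are synthetic and the name `one_country`, the Sex levels "F" and "M", and the true coefficient values are all assumptions made up for illustration; only the column names come from the question.

```r
# Synthetic data for ONE country (illustrative values, not from the question)
set.seed(42)
one_country <- data.frame(
  Sex  = factor(rep(c("F", "M"), each = 25)),
  Mean = rnorm(50),
  X    = rnorm(50),
  Y    = rnorm(50)
)
# True model: intercept and Mean slope differ by Sex, X and Y slopes are shared
one_country$Value <- with(
  one_country,
  ifelse(Sex == "M", 3, 1) + ifelse(Sex == "M", -1, 2) * Mean +
    0.5 * X - 0.25 * Y + rnorm(50, sd = 0.05)
)

mod <- lm(Value ~ Mean * Sex + X + Y, data = one_country)
b <- coef(mod)

# With treatment contrasts, "F" is the reference level of Sex
a0 <- c(F = b[["(Intercept)"]], M = b[["(Intercept)"]] + b[["SexM"]])
a1 <- c(F = b[["Mean"]],        M = b[["Mean"]] + b[["Mean:SexM"]])
a2 <- b[["X"]]  # shared across Sex
a3 <- b[["Y"]]  # shared across Sex

print(list(a0 = a0, a1 = a1, a2 = a2, a3 = a3))
```

The coefficient names `SexM` and `Mean:SexM` depend on the factor levels and on R's default treatment contrasts, so they will differ if Sex is coded otherwise.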

Alternatively, estimate the model in one go, adding interactions with both Sex and Country:

# Country and Sex must be factors (or character columns) for the interactions to work
mod <- lm(Value ~ Sex*Country*Mean + Country*X + Country*Y, data = DF)
A <- predict(mod)
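
As a sanity check, the single model over all rows and the separate per-Country models should give the same fitted values, since both allow an intercept and Mean slope per Country and Sex and an X and Y slope per Country. The sketch below checks this on synthetic data using base R only; the country labels, sample sizes and coefficient values are assumptions made up for the example.

```r
# Synthetic data; column names follow the question, values are invented
set.seed(1)
DF <- expand.grid(Country = c("A", "B"), Sex = c("F", "M"), rep = 1:20,
                  stringsAsFactors = TRUE)
DF$Mean  <- rnorm(nrow(DF))
DF$X     <- rnorm(nrow(DF))
DF$Y     <- rnorm(nrow(DF))
DF$Value <- 1 + 2 * DF$Mean + 0.5 * DF$X - DF$Y + rnorm(nrow(DF), sd = 0.1)

# One model over all rows
pooled <- lm(Value ~ Sex * Country * Mean + Country * X + Country * Y, data = DF)
fit_pooled <- predict(pooled)

# One model per Country, with Sex-specific intercept and Mean slope
fit_by_country <- numeric(nrow(DF))
for (ctry in levels(DF$Country)) {
  rows <- DF$Country == ctry
  m <- lm(Value ~ Mean * Sex + X + Y, data = DF[rows, ])
  fit_by_country[rows] <- predict(m)
}

# Largest absolute difference between the two sets of fitted values
max_diff <- max(abs(fit_pooled - fit_by_country))
print(max_diff)
```

The predictions agree, but the single model assumes one residual variance for all countries, so standard errors and tests can differ from those of the per-Country fits.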

Upvotes: 2
