Friedebert

Reputation: 11

Should an unbalanced variable be used as an independent variable in a linear model used for inferential statistics?

Hi,

Situation: I am trying to create a linear (mixed) model. I know from previous research that a person's sex and disease severity influence the outcome variable. The variable I am interested in is called treatment. I am planning to use a likelihood ratio test to test for treatment effects.

Problem: The study that I am analysing right now is not well balanced:

Questions:

I am not directly interested in the effects of sex and disease severity on the outcome, as my focus is the treatment effect. I am mainly interested in creating a model that makes both medical and statistical sense.

Thank you so much :)

Upvotes: 1

Views: 278

Answers (1)

Robert Long

Reputation: 6887

Should sex be considered an independent variable in a linear (mixed) model that is used for inferential statistics?

Should including both variables be avoided because of collinearity, and could sex then "cover both" the disease-specific and sex-specific (confounding) effects?

Typically sex is included as a regressor in a regression model because it is a potential confounder. Sex is often causally associated with both the outcome and the main exposure, so yes, if this is likely in your study then it should be included. It doesn't matter that the design is unbalanced; mixed models are able to handle that. Multicollinearity would only be a problem if the correlation between the regressors were extremely high. Some correlation between them is expected.
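
As a rough illustration of the collinearity point, here is a minimal Python sketch that computes the correlation between sex and disease severity and the resulting variance inflation factor (VIF). Everything in it is an assumption made for illustration: the data are made up (they are not from the study in the question), both variables are coded 0/1, and the VIF formula used, 1 / (1 - r^2), only applies to the simplified case of exactly two correlated regressors. In a real model with treatment and other covariates, the VIF would come from regressing each predictor on all the others.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """VIF shared by two regressors whose correlation is r (two-predictor case only)."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical, made-up data for illustration only.
# sex: 0 = female, 1 = male; severity: 0 = mild, 1 = severe.
# The groups are deliberately unbalanced (6 vs 12 people).
sex      = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
severity = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]

r = pearson(sex, severity)
vif = vif_two_predictors(r)

print(f"correlation between sex and severity: {r:.3f}")
print(f"VIF (two-predictor approximation):    {vif:.3f}")
```

With data like these the two variables are clearly correlated, yet the VIF stays close to 1, so the standard errors are barely inflated. Only when the correlation approaches 1 does the VIF grow large enough to make the separate effects hard to estimate.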

I would recommend drawing a DAG to determine which variables to include or exclude. Please see this thread for details of how and why to do that:
How do DAGs help to reduce bias in causal inference?

Upvotes: 0
