Confounding variables in Propensity Score Matching: balanced variables have a Standardized Mean Difference between Treatment & Control > 0.1

Question

This is a causal inference question, specifically about how to handle an unbalanced covariate after propensity score matching. I fit an XGBoost model to estimate propensity scores for users (XGBoost had higher accuracy, precision, and AUC than logistic regression). When I compute the Standardized Mean Differences (SMDs) of the balancing variables between the control and treatment groups, one feature (user age, which ranks high in feature gain/importance) is above the SMD threshold of 0.1. Things I have tried so far to remediate this:

Increased the sample size of control and treatment
Downsampled to ensure treated users are not matched to control users more than once
Re-sampled the data to make sure the training data age group distribution is proportionally the same for control and treatment

I'm stuck! How can I make sure that the SMD for user age is below 0.1? Unsure how to move forward with this confounding variable, as it is a highly important feature. Any help would be greatly appreciated.
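
For reference, the SMD check for a single covariate can be sketched as follows. This is a minimal sketch, assuming the common definition (difference in group means divided by the pooled standard deviation of the two groups); the function name and the toy age values are hypothetical and not taken from the actual pipeline or data.

```python
import numpy as np

def standardized_mean_difference(treated, control):
    """SMD = (mean_t - mean_c) / sqrt((var_t + var_c) / 2).

    Assumes the pooled-standard-deviation definition with sample
    variances (ddof=1); other definitions exist.
    """
    treated = np.asarray(treated, dtype=float)
    control = np.asarray(control, dtype=float)
    pooled_sd = np.sqrt((treated.var(ddof=1) + control.var(ddof=1)) / 2.0)
    return (treated.mean() - control.mean()) / pooled_sd

# Hypothetical ages of matched users (toy values, for illustration only).
age_treated = [2.0, 4.0, 6.0]
age_control = [1.0, 3.0, 5.0]

smd_age = standardized_mean_difference(age_treated, age_control)
# Balance is usually judged on the absolute value against the 0.1 threshold.
is_unbalanced = abs(smd_age) > 0.1
print(f"SMD(age) = {smd_age:.3f}, unbalanced: {is_unbalanced}")
```

With matching, the same function would be applied to the matched sample only, once per covariate.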
