ecjb

Reputation: 5449

3:1 matching with MatchIt in R: the number of matched controls is not equal to 3 times the number of cases

Hello, I'm using the MatchIt package in R.

I have a total of 116 unmatched treated cases and 462 unmatched non-treated cases.

with the command

mod_match_logit = matchit(f.build("treatement_yes_or_no", covariates),
                          method = "nearest", distance = "logit",
                          data = df, caliper = 0.05, ratio = 3)

I then get 91 matched treated cases and 248 matched non-treated cases. What I don't understand is that, with 3:1 matching, I should have 91 * 3 = 273 matched non-treated cases, not 248. By default, replace is set to FALSE in MatchIt, so that does not explain the difference for me. What am I missing?

Upvotes: 1

Views: 2399

Answers (1)

Jason Johnson

Reputation: 451

Without seeing the data I am only guessing, but it is most likely due to your caliper setting.

MatchIt defines the caliper as "the number of standard deviations of the distance measure within which to draw control units (default = 0, no caliper matching)" (p. 26).

Therefore my guess is that you have some units in the treatment group with high propensity scores that cannot be matched to units in the untreated group, at least not within the 0.05 standard deviations you specified. The reason you are not getting 273 subjects in your matched data set is the caliper = 0.05 setting in your matchit() call. Some treated subjects with higher propensity scores still get matched to at least one untreated subject but cannot be matched to a second or third, because those additional controls fall beyond the 0.05 caliper. Increasing the caliper would retain more matches, but I would not go higher than 0.25, based on best practices documented in the literature.
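To make the mechanism concrete, here is a simplified sketch of greedy 1:k nearest-neighbor matching with a caliper. The scores and function name are made up, the caliper is applied directly on the score scale for simplicity (MatchIt measures it in standard deviations of the distance), and MatchIt's internals differ in detail, but the counting effect is the same: a treated unit stays in the sample if it gets at least one match, yet it may contribute fewer than k controls.

```python
def caliper_match(treated, controls, ratio=3, caliper=0.05):
    """Match each treated score to up to `ratio` unused control scores
    lying within `caliper` of it. Returns {treated index: [control indices]}."""
    available = dict(enumerate(controls))  # control index -> score
    matches = {}
    for t_idx, t_score in enumerate(treated):
        picked = []
        for _ in range(ratio):
            # nearest remaining control within the caliper, if any
            candidates = [(abs(c - t_score), i) for i, c in available.items()
                          if abs(c - t_score) <= caliper]
            if not candidates:
                break  # no more controls close enough to this treated unit
            _, best = min(candidates)
            picked.append(best)
            del available[best]
        if picked:  # the treated unit is kept as long as it matched at least once
            matches[t_idx] = picked
    return matches

# Treated unit 1 has a high score; only one control lies within its caliper,
# so it contributes 1 control instead of 3.
treated = [0.30, 0.90]
controls = [0.28, 0.31, 0.33, 0.87, 0.50, 0.29]
m = caliper_match(treated, controls)
print(m)
print(sum(len(v) for v in m.values()))  # 4 matched controls, not 2 * 3 = 6
```

Both treated units survive the match, but the control count is 4 rather than 6, which is exactly the pattern in the question (91 treated kept, 248 controls rather than 273).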

Depending on your research design, you could consider other matching methods. For example, you could use distances other than Euclidean, such as Mahalanobis, which is an option in MatchIt. Alternatively, you could use optimal full matching or optimal pair matching from the 'optmatch' library, both of which can also be called through the matchit() function. There are many other approaches, but these are easily accessible from the MatchIt library. The literature does suggest trying a few different methods and then checking for balance, as long as you do not "cherry-pick" the one that gives you the largest effect. In other words, select your matched set based on covariate balance, not on the outcome variable in your study. There is definitely a bit of an art to propensity score matching, but that is why I find it so interesting!
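The balance check mentioned above usually means computing a standardized mean difference (SMD) for each covariate in the treated versus matched-control groups. A minimal sketch of that statistic, with made-up data (this is the general formula, not MatchIt's or cobalt's API):

```python
import statistics

def smd(treated_vals, control_vals):
    """Standardized mean difference: difference in group means divided by the
    pooled standard deviation. |SMD| < 0.1 is a common balance rule of thumb."""
    mt, mc = statistics.mean(treated_vals), statistics.mean(control_vals)
    st, sc = statistics.stdev(treated_vals), statistics.stdev(control_vals)
    pooled = ((st ** 2 + sc ** 2) / 2) ** 0.5
    return (mt - mc) / pooled

# Hypothetical ages: badly imbalanced before matching, acceptable after.
age_treated        = [60, 62, 65, 70, 68]
age_control_before = [40, 45, 50, 42, 48]
age_control_after  = [61, 63, 64, 69, 66]
print(round(smd(age_treated, age_control_before), 2))  # ~4.85: severe imbalance
print(round(smd(age_treated, age_control_after), 2))   # ~0.11: near the 0.1 threshold
```

You would compute this for every covariate under each candidate matching method and pick the method with the best overall balance, never looking at the outcome variable while choosing.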

Upvotes: 2
