Propensity score matching
In the statistical analysis of observational data, propensity score matching (PSM) is a statistical matching technique that attempts to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment. PSM attempts to reduce the bias due to confounding variables that could be found in an estimate of the treatment effect obtained from simply comparing outcomes among units that received the treatment versus those that did not. Paul R. Rosenbaum and Donald Rubin introduced the technique in 1983.
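The basic procedure can be illustrated with a minimal sketch. It assumes a logistic regression for the propensity score (fitted here by plain gradient descent) and 1:1 nearest-neighbour matching with replacement; these are common choices, not the only ones. The data are simulated, and all function names, coefficients, and the true effect of 2.0 are hypothetical values chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear_index(w, x):
    """Intercept plus weighted sum of the covariates."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))


def fit_logistic(X, t, lr=0.5, epochs=300):
    """Fit P(T=1 | X) by batch gradient descent; returns [intercept, coef_1, ...]."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, ti in zip(X, t):
            err = sigmoid(linear_index(w, xi)) - ti
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def match_att(scores, t, y):
    """Match each treated unit to the control with the closest score
    (with replacement) and average the outcome differences (ATT)."""
    treated = [i for i, ti in enumerate(t) if ti == 1]
    controls = [i for i, ti in enumerate(t) if ti == 0]
    pairs = [(i, min(controls, key=lambda j: abs(scores[i] - scores[j])))
             for i in treated]
    att = sum(y[i] - y[j] for i, j in pairs) / len(pairs)
    return pairs, att


# Simulated data (assumption): x1 and x2 affect both treatment and outcome.
random.seed(0)
TRUE_EFFECT = 2.0
X, t, y = [], [], []
for _ in range(1000):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    ti = 1 if random.random() < sigmoid(0.8 * x1 + 0.5 * x2) else 0
    X.append([x1, x2])
    t.append(ti)
    y.append(TRUE_EFFECT * ti + 2.0 * x1 + 1.0 * x2 + random.gauss(0, 1))

w = fit_logistic(X, t)
scores = [sigmoid(linear_index(w, xi)) for xi in X]
pairs, att = match_att(scores, t, y)

# Unadjusted comparison of treated and untreated outcomes, for contrast.
n_treated = sum(t)
naive = (sum(yi for yi, ti in zip(y, t) if ti == 1) / n_treated
         - sum(yi for yi, ti in zip(y, t) if ti == 0) / (len(t) - n_treated))
print(f"naive difference in means: {naive:.2f}")
print(f"matched ATT estimate:      {att:.2f}")
```

In this simulation the unadjusted difference in means is inflated by confounding, while the matched estimate should lie closer to the simulated effect. That holds only because every confounder is observed and the score model is correctly specified, which real data do not guarantee.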
Advantages and disadvantages
PSM has been shown to increase "imbalance, inefficiency, model dependence, and bias" and is no longer recommended over other matching methods.[1] The insights behind the use of matching still hold but should be applied through other matching methods; propensity scores also remain useful in weighting and doubly robust estimation.
Like other matching procedures, PSM estimates an average treatment effect from observational data. At the time of its introduction, the key advantage of PSM was that, by using a linear combination of covariates to form a single score, it balances treatment and control groups on a large number of covariates without losing a large number of observations. If units in the treatment and control groups were instead balanced on a large number of covariates one at a time, large numbers of observations would be needed to overcome the "dimensionality problem", whereby each new balancing covariate increases the minimum necessary number of observations in the sample geometrically.
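The dimensionality problem can be made concrete with a small simulation. The sketch below assumes independent, evenly split binary covariates and 200 units per group (both hypothetical choices) and reports how often a treated unit finds a control with exactly the same covariate profile. With k binary covariates there are 2^k possible profiles, so exact matches become rare quickly, whereas a propensity score remains a single number however many covariates enter it.

```python
import random


def exact_match_rate(n_covariates, n_per_group=200, seed=0):
    """Share of treated units whose full covariate profile also occurs among controls."""
    rng = random.Random(seed)

    def profile():
        # One unit's covariates: a tuple of independent fair coin flips.
        return tuple(rng.randint(0, 1) for _ in range(n_covariates))

    treated = [profile() for _ in range(n_per_group)]
    control_profiles = {profile() for _ in range(n_per_group)}
    return sum(p in control_profiles for p in treated) / n_per_group


for k in (1, 2, 4, 8, 12, 16):
    print(f"{k:>2} covariates: {2 ** k:>6} cells, "
          f"exact-match rate {exact_match_rate(k):.2f}")
```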
One disadvantage of PSM is that it only accounts for observed (and observable) covariates. Factors that affect assignment to treatment and outcome but that cannot be observed cannot be accounted for in the matching procedure.[2] As the procedure only controls for observed variables, any hidden bias due to latent variables may remain after matching.[3] Another issue is that PSM requires large samples, with substantial overlap between treatment and control groups.
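The overlap requirement can be checked informally before matching. The sketch below uses a simple minimum-maximum rule for the region of common support, which is one of several possible diagnostics and an assumption here rather than something the cited sources prescribe. The function name and the example scores are hypothetical.

```python
def common_support(scores, treated):
    """Return the score interval covered by both groups and, for each unit,
    whether its score falls inside that interval."""
    t_scores = [s for s, t in zip(scores, treated) if t == 1]
    c_scores = [s for s, t in zip(scores, treated) if t == 0]
    low = max(min(t_scores), min(c_scores))
    high = min(max(t_scores), max(c_scores))
    # If low > high the groups do not overlap and no unit is flagged as inside.
    inside = [low <= s <= high for s in scores]
    return low, high, inside


# Hypothetical propensity scores: four controls followed by four treated units.
scores = [0.10, 0.30, 0.45, 0.60, 0.35, 0.55, 0.80, 0.95]
treated = [0, 0, 0, 0, 1, 1, 1, 1]

low, high, inside = common_support(scores, treated)
print(f"common support: [{low}, {high}]")
print(f"units inside:   {sum(inside)} of {len(scores)}")
```

Units outside the interval have no comparable counterpart in the other group, so any estimate obtained after discarding them applies only to the overlapping part of the sample.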
General concerns with matching have also been raised by Judea Pearl, who has argued that hidden bias may actually increase because matching on observed variables may unleash bias due to dormant unobserved confounders. Similarly, Pearl has argued that bias reduction can only be assured (asymptotically) by modelling the qualitative causal relationships between treatment, outcome, observed and unobserved covariates.[4] Confounding occurs when the experimenter is unable to control for alternative, non-causal explanations for an observed relationship between independent and dependent variables. Such control should satisfy the "backdoor criterion" of Pearl.[5]
Implementations in statistics packages
- R: propensity score matching is available as part of the MatchIt package.[6][7] It can also easily be implemented manually.[8]
- SAS: the PSMatch procedure and the macro OneToManyMTCH match observations based on a propensity score.[9]
- Stata: several commands implement propensity score matching,[10] including the user-written psmatch2.[11] Stata version 13 and later also offers the built-in command teffects psmatch.[12]
- SPSS: a dialog box for Propensity Score Matching is available from the IBM SPSS Statistics menu (Data/Propensity Score Matching). It allows the user to set the match tolerance, randomize case order when drawing samples, prioritize exact matches, sample with or without replacement, set a random seed, and maximize performance by increasing processing speed and minimizing memory usage. The FUZZY Python procedure can also easily be added as an extension to the software through the Extensions dialog box. This procedure matches cases and controls by utilizing random draws from the controls, based on a specified set of key variables. The FUZZY command supports exact and fuzzy matching.
References
- ^ King, Gary; Nielsen, Richard (2019-05-07). "Why Propensity Scores Should Not Be Used for Matching". Political Analysis. 27 (4): 435–454. doi:10.1017/pan.2019.11. ISSN 1047-1987.
- ^ Garrido MM, et al. (2014). "Methods for Constructing and Assessing Propensity Scores". Health Services Research. 49 (5): 1701–20. doi:10.1111/1475-6773.12182. PMC 4213057. PMID 24779867.
- ^ Shadish, W. R.; Cook, T. D.; Campbell, D. T. (2002). Experimental and Quasi-experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin. ISBN 978-0-395-61556-0.
- ^ Pearl, J. (2009). "Understanding propensity scores". Causality: Models, Reasoning, and Inference (Second ed.). New York: Cambridge University Press. ISBN 978-0-521-89560-6.
- ^ [Citation missing in the source: named reference "pearl".]
- ^ Ho, Daniel; Imai, Kosuke; King, Gary; Stuart, Elizabeth (2007). "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference". Political Analysis. 15 (3): 199–236. doi:10.1093/pan/mpl013.
- ^ "MatchIt: Nonparametric Preprocessing for Parametric Causal Inference". R Project.
- ^ Gelman, Andrew; Hill, Jennifer (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models. New York: Cambridge University Press. pp. 206–212. ISBN 978-0-521-68689-1.
- ^ Parsons, Lori. "Performing a 1:N Case-Control Match on Propensity Score" (PDF). SUGI 29: SAS Institute. Retrieved June 10, 2016.
- ^ Implementing Propensity Score Matching Estimators with STATA. Lecture notes 2001
- ^ Leuven, E.; Sianesi, B. (2003). "PSMATCH2: Stata module to perform full Mahalanobis and propensity score matching, common support graphing, and covariate imbalance testing".
- ^ "teffects psmatch: Propensity-score matching" (PDF). Stata Manual.
Further reading
- Abadie, Alberto; Imbens, Guido W. (2006). "Large Sample Properties of Matching Estimators for Average Treatment Effects". Econometrica. 74 (1): 235–267. CiteSeerX 10.1.1.559.6313. doi:10.1111/j.1468-0262.2006.00655.x.
- Leite, Walter L. (2017). Practical Propensity Score Methods using R. Washington, DC: Sage Publications. ISBN 978-1-4522-8888-8.