Causal mediation analysis: from simple to more robust strategies for estimation of marginal natural (in)direct effects. (English) Zbl 07649352

Summary: This paper aims to provide practitioners of causal mediation analysis with a better understanding of estimation options. We take as inputs two familiar strategies (weighting and model-based prediction) and a simple way of combining them (weighted models), and show how a range of estimators can be generated, with different modeling requirements and robustness properties. The primary goal is to help build intuitive appreciation for robust estimation that is conducive to sound practice. We do this by visualizing the target estimand and the estimation strategies. A second goal is to provide a “menu” of estimators that practitioners can choose from for the estimation of marginal natural (in)direct effects. The estimators generated from this exercise include some that coincide with or are similar to existing estimators and others that have not previously appeared in the literature. We note several different ways to estimate the weights for cross-world weighting based on three expressions of the weighting function, including one that is novel, and show how to check the resulting covariate and mediator balance. We use a random continuous weights bootstrap to obtain confidence intervals, and also derive general asymptotic variance formulas for the estimators. The estimators are illustrated using data from an adolescent alcohol use prevention study. R code is provided.

MSC:

62D20 Causal inference from observational studies

References:

[1] ALBERT, J. M. (2012). Distribution-free mediation analysis for nonlinear models with confounding. Epidemiology 23 879-88. · doi:10.1097/EDE.0b013e31826c2bb9
[2] DIDELEZ, V., DAWID, A. P. and GENELETTI, S. (2006). Direct and Indirect Effects of Sequential Treatments. In Proceedings of the 22nd Conference on Uncertainty in Artificial Intelligence 138-146. AUAI Press.
[3] EFRON, B. (1979). Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics 7 1-26. · Zbl 0406.62024
[4] GREIFER, N. (2022). cobalt: Covariate Balance Tables and Plots. R package version 4.3.2.
[5] HAINMUELLER, J. (2012). Entropy balancing for causal effects: A multivariate reweighting method to produce balanced samples in observational studies. Political Analysis 20 25-46. · doi:10.1093/pan/mpr025
[6] HOLLAND, P. W. (1986). Statistics and Causal Inference. Journal of the American Statistical Association 81 945. · Zbl 0607.62001 · doi:10.2307/2289064
[7] HONG, G. (2010). Ratio of mediator probability weighting for estimating natural direct and indirect effects. In Proceedings of the American Statistical Association, Biometrics Section 2401-2415.
[8] HONG, G., DEUTSCH, J. and HILL, H. D. (2015). Ratio-of-mediator-probability weighting for causal mediation analysis in the presence of treatment-by-mediator interaction. Journal of Educational and Behavioral Statistics 40 307-340. · doi:10.3102/1076998615583902
[9] HONG, G., QIN, X. and YANG, F. (2018). Weighting-Based Sensitivity Analysis in Causal Mediation Studies. Journal of Educational and Behavioral Statistics 43 32-56. · doi:10.3102/1076998617749561
[10] HONG, G., YANG, F. and QIN, X. (2021). Post-Treatment Confounding in Causal Mediation Studies: A Cutting-Edge Problem and A Novel Solution via Sensitivity Analysis. · doi:10.48550/arXiv.2107.11014
[11] HONG, G., YANG, F. and QIN, X. (2021). Did you conduct a sensitivity analysis? A new weighting-based approach for evaluations of the average treatment effect for the treated. Journal of the Royal Statistical Society: Series A (Statistics in Society) 184 227-254. · doi:10.1111/rssa.12621
[12] HUBER, M. (2014). Identifying causal mechanisms (primarily) based on inverse probability weighting. Journal of Applied Econometrics 29 920-943. · doi:10.1002/jae.2341
[13] HUBER, M. (2020). Mediation Analysis. In Handbook of Labor, Human Resources and Population Economics (K. F. Zimmermann, ed.) Springer. · doi:10.1007/978-3-319-57365-6_162-1
[14] HULING, J. D. and MAK, S. (2020). Energy balancing of covariate distributions. arXiv 1-68.
[15] IMAI, K., KEELE, L. and TINGLEY, D. (2010). A general approach to causal mediation analysis. Psychological Methods 15 309-34. · doi:10.1037/a0020761
[16] IMAI, K., KEELE, L. and YAMAMOTO, T. (2010). Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science 25 51-71. · Zbl 1328.62478 · doi:10.1214/10-STS321
[17] IMAI, K. and RATKOVIC, M. (2014). Covariate balancing propensity score. Journal of the Royal Statistical Society. Series B: Statistical Methodology 76 243-263. · Zbl 1411.62025
[18] JACKSON, J. W. (2021). Meaningful Causal Decompositions in Health Equity Research: Definition, Identification, and Estimation Through a Weighting Framework. Epidemiology 32 282-290. · doi:10.1097/EDE.0000000000001319
[19] KANG, J. D. Y. and SCHAFER, J. L. (2007). Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data. Statistical Science 22 523-539. · Zbl 1246.62073 · doi:10.1214/07-sts227rej
[20] KONING, I. M., VAN DEN EIJNDEN, R. J., VERDURMEN, J. E., ENGELS, R. C. and VOLLEBERGH, W. A. (2011). Long-term effects of a parent and student intervention on alcohol use in adolescents: A cluster randomized controlled trial. American Journal of Preventive Medicine 40 541-547. · doi:10.1016/j.amepre.2010.12.030
[21] KONING, I. M., VAN DEN EIJNDEN, R. J. J. M., ENGELS, R. C. M. E., VERDURMEN, J. E. E. and VOLLEBERGH, W. A. M. (2010). Why target early adolescents and parents in alcohol prevention? The mediating effects of self-control, rules and attitudes about alcohol use. Addiction 106 538-46. · doi:10.1111/j.1360-0443.2010.03198.x
[22] KONING, I. M., VOLLEBERGH, W. A. M., SMIT, F., VERDURMEN, J. E. E., VAN DEN EIJNDEN, R. J. J. M., TER BOGT, T. F. M., STATTIN, H. and ENGELS, R. C. M. E. (2009). Preventing heavy alcohol use in adolescents (PAS): cluster randomized trial of a parent and student intervention offered separately and simultaneously. Addiction 104 1669-78. · doi:10.1111/j.1360-0443.2009.02677.x
[23] LANGE, T., VANSTEELANDT, S. and BEKAERT, M. (2012). A simple unified approach for estimating natural direct and indirect effects. American Journal of Epidemiology 176 190-195. · doi:10.1093/aje/kwr525
[24] MILES, C., KANKI, P., MELONI, S. and TCHETGEN TCHETGEN, E. (2017). On Partial Identification of the Natural Indirect Effect. Journal of Causal Inference 5. · doi:10.1515/jci-2016-0004
[25] MUTHÉN, B. O. and ASPAROUHOV, T. (2015). Causal effects in mediation modeling: An introduction with applications to latent variables. Structural Equation Modeling 22 12-23. · doi:10.1080/10705511.2014.935843
[26] NGUYEN, T. Q., OGBURN, E. L., SCHMID, I., SARKER, E. B., GREIFER, N., KONING, I. M. and STUART, E. A. (2022). Causal mediation analysis: From simple to more robust strategies for estimation of marginal natural (in)direct effects. arXiv:2102.06048. Version 3.
[27] NGUYEN, T. Q., SCHMID, I., OGBURN, E. L. and STUART, E. A. (2022). Clarifying Causal Mediation Analysis: Effect Identification via Three Assumptions and Five Potential Outcomes. Journal of Causal Inference 10 246-279. · doi:10.1515/jci-2021-0049
[28] NGUYEN, T. Q., SCHMID, I. and STUART, E. A. (2021). Clarifying causal mediation analysis for the applied researcher: Defining effects based on what we want to learn. Psychological Methods 26 255-271. · doi:10.1037/met0000299
[29] NOWOK, B., RAAB, G. M. and DIBBEN, C. (2016). synthpop: Bespoke Creation of Synthetic Data in R. Journal of Statistical Software 74 1-26. · doi:10.18637/jss.v074.i11
[30] PEARL, J. (2001). Direct and indirect effects. Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence 411-420.
[31] PEARL, J. (2012). The causal mediation formula-a guide to the assessment of pathways and mechanisms. Prevention Science 13 426-36. · doi:10.1007/s11121-011-0270-1
[32] QIN, X. and YANG, F. (2021). Simulation-based sensitivity analysis for causal mediation studies. Psychological Methods. · doi:10.1037/met0000340
[33] ROBINS, J., SUED, M., LEI-GOMEZ, Q. and ROTNITZKY, A. (2007). Comment: Performance of Double-Robust Estimators When “Inverse Probability” Weights Are Highly Variable. Statistical Science 22 544-559. · Zbl 1246.62076 · doi:10.1214/07-STS227D
[34] ROBINS, J. M. and GREENLAND, S. (1992). Identifiability and exchangeability for direct and indirect effects. Epidemiology 3 143-155.
[35] ROBINS, J. M., RICHARDSON, T. S. and SHPITSER, I. (2022). An Interventionist Approach to Mediation Analysis. In Probabilistic and Causal Inference: The Works of Judea Pearl, first ed. 36 713-764. Association for Computing Machinery, New York, NY, USA. · Zbl 07672268
[36] ROSENBAUM, P. R. and RUBIN, D. B. (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika 70 41. · Zbl 0522.62091 · doi:10.2307/2335942
[37] RUBIN, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology 66 688-701. · doi:10.1037/h0037350
[38] STEEN, J., LOEYS, T., MOERKERKE, B. and VANSTEELANDT, S. (2017). Medflex: An R package for flexible mediation analysis using natural effect models. Journal of Statistical Software 76. · doi:10.18637/jss.v076.i11
[39] STEFANSKI, L. A. and BOOS, D. D. (2002). The calculus of M-estimation. The American Statistician 56 29-38. · doi:10.1198/000313002753631330
[40] STEINGRIMSSON, J. A., HANLEY, D. F. and ROSENBLUM, M. (2017). Improving precision by adjusting for prognostic baseline variables in randomized trials with binary outcomes, without regression model assumptions. Contemporary Clinical Trials 54 18-24. · doi:10.1016/j.cct.2016.12.026
[41] SZÉKELY, G. J. and RIZZO, M. L. (2013). Energy statistics: A class of statistics based on distances. Journal of Statistical Planning and Inference 143 1249-1272. · Zbl 1278.62072 · doi:10.1016/j.jspi.2013.03.018
[42] TCHETGEN TCHETGEN, E. J. (2013). Inverse odds ratio-weighted estimation for causal mediation analysis. Statistics in Medicine 32 4567-4580. · doi:10.1002/sim.5864
[43] TCHETGEN TCHETGEN, E. J. and SHPITSER, I. (2012). Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness and sensitivity analysis. The Annals of Statistics 40 1816-1845. · Zbl 1257.62033 · doi:10.1214/12-AOS990
[44] TCHETGEN TCHETGEN, E. J. and SHPITSER, I. (2014). Estimation of a semiparametric natural direct effect model incorporating baseline covariates. Biometrika 101 849-864. · Zbl 1306.62091 · doi:10.1093/biomet/asu044
[45] TINGLEY, D., YAMAMOTO, T., HIROSE, K., KEELE, L. and IMAI, K. (2014). mediation: R package for causal mediation analysis. Journal of Statistical Software 59 1-38. · doi:10.18637/jss.v059.i05
[46] VALERI, L. and VANDERWEELE, T. J. (2013). Mediation analysis allowing for exposure-mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological methods 18 137-150. · doi:10.1037/a0031034
[47] VAN BUUREN, S. and GROOTHUIS-OUDSHOORN, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software 45 1-67.
[48] VAN DER LAAN, M. J. and ROSE, S. (2011). Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Series in Statistics. Springer New York.
[49] VANDERWEELE, T. J. and VANSTEELANDT, S. (2009). Conceptual issues concerning mediation, interventions and composition. Statistics and its Interface 2 457-468. · Zbl 1245.62177
[50] VANDERWEELE, T. J. and VANSTEELANDT, S. (2010). Odds ratios for mediation analysis for a dichotomous outcome. American Journal of Epidemiology 172 1339-1348. · doi:10.1093/aje/kwq332
[51] VANDERWEELE, T. J. and VANSTEELANDT, S. (2013). Mediation analysis with multiple mediators. Epidemiologic Methods 2 95-115. · Zbl 1359.92009 · doi:10.1515/em-2012-0010
[52] VANDERWEELE, T. J., VANSTEELANDT, S. and ROBINS, J. M. (2014). Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology 25 300-6. · doi:10.1097/EDE.0000000000000034
[53] VANSTEELANDT, S., BEKAERT, M. and LANGE, T. (2012). Imputation strategies for the estimation of natural direct and indirect effects. Epidemiologic Methods 1 7. · Zbl 1462.62076 · doi:10.1515/2161-962X.1014
[54] VANSTEELANDT, S. and KEIDING, N. (2011). Invited commentary: G-computation-Lost in translation? American Journal of Epidemiology 173 739-742. · doi:10.1093/aje/kwq474
[55] WANG, B., OGBURN, E. L. and ROSENBLUM, M. (2019). Analysis of Covariance in Randomized Trials: More Precision, Less Conditional Bias, and Valid Confidence Intervals, Without Model Assumptions. Biometrics 75 1391-1400. · Zbl 1448.62115 · doi:10.1111/biom.13062
[56] WANG, L. and TCHETGEN TCHETGEN, E. (2018). Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variables. Journal of the Royal Statistical Society. Series B: Statistical Methodology 80 531-550. · Zbl 1398.62348 · doi:10.1111/rssb.12262
[57] WANG, Z. and LOUIS, T. A. (2003). Matching conditional and marginal shapes in binary random intercept models using a bridge distribution function. Biometrika 90 765-775. · Zbl 1436.62294 · doi:10.1093/biomet/90.4.765
[58] WANG, Z. and LOUIS, T. A. (2004). Marginalized Binary Mixed-Effects Models with Covariate-Dependent Random Effects and Likelihood Inference. Biometrics 60 884-891. · Zbl 1274.62182 · doi:10.1111/j.0006-341X.2004.00243.x
[59] XU, L., GOTWALT, C., HONG, Y., KING, C. B. and MEEKER, W. Q. (2020). Applications of the Fractional-Random-Weight Bootstrap. American Statistician 1305 1-32. · Zbl 07593703 · doi:10.1080/00031305.2020.1731599
[60] ZHENG, W. and VAN DER LAAN, M. J. (2012). Targeted maximum likelihood estimation of natural direct effects. The International Journal of Biostatistics 8. · doi:10.2202/1557-4679.1361
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.