Experimental design in marketplaces. (English) Zbl 07792876

Summary: Classical randomized controlled trials (RCTs), or A/B tests, are designed to draw causal inferences about a population of units, for example, individuals, plots of land, or visits to a website. A key assumption underlying a standard RCT is the absence of interactions between units, also known as the stable unit treatment value assumption (SUTVA) [D. B. Rubin, Ann. Stat. 6, 34–58 (1978; Zbl 0383.62021)]. Modern experimentation, however, is often conducted in settings characterized by complex interactions between units. Such interactions can invalidate the standard estimators and make classical experimental designs ineffective. Although the presence of interference forces us to make untestable assumptions about the nature of the interactions even under randomization, sophisticated experimental designs can reduce the dependence on such assumptions. In this manuscript, we review the recent and rapidly growing literature on novel experimental designs for these settings. One key feature common to many of these designs is the presence of multiple layers of randomization within the same experiment. We discuss a novel class of experimental designs, called Multiple Randomization Designs (MRDs), that provides a general framework for such experiments. Through these more complex designs, we can study questions about causal effects in the presence of interference that cannot be answered by classical RCTs.
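
As a rough illustration of what "multiple layers of randomization within the same experiment" could look like in a two-sided marketplace, the following Python sketch randomizes buyers and sellers independently and treats a buyer-seller pair only when both sides were selected. This pairing rule is an assumption made for illustration and is not taken from the summary; the function names `assign_double_randomization` and `count_pair_cells`, the population sizes, and the treated fractions are all hypothetical.

```python
import random


def assign_double_randomization(n_buyers, n_sellers, p_buyer, p_seller, seed=0):
    """Hypothetical sketch of a two-layer randomization over buyer-seller pairs.

    Assumption (not from the source): a pair is treated only when both its
    buyer and its seller were independently selected for treatment.
    """
    rng = random.Random(seed)
    # Layer 1: completely randomize a fixed number of buyers into treatment.
    treated_buyers = set(rng.sample(range(n_buyers), round(p_buyer * n_buyers)))
    # Layer 2: independently randomize a fixed number of sellers.
    treated_sellers = set(rng.sample(range(n_sellers), round(p_seller * n_sellers)))
    # Pair-level assignment matrix: rows are buyers, columns are sellers.
    assignment = [
        [int(b in treated_buyers and s in treated_sellers) for s in range(n_sellers)]
        for b in range(n_buyers)
    ]
    return treated_buyers, treated_sellers, assignment


def count_pair_cells(treated_buyers, treated_sellers, n_buyers, n_sellers):
    """Count pairs in each of the four cells created by the two randomizations."""
    counts = {"both": 0, "buyer_only": 0, "seller_only": 0, "neither": 0}
    for b in range(n_buyers):
        for s in range(n_sellers):
            if b in treated_buyers and s in treated_sellers:
                counts["both"] += 1
            elif b in treated_buyers:
                counts["buyer_only"] += 1
            elif s in treated_sellers:
                counts["seller_only"] += 1
            else:
                counts["neither"] += 1
    return counts


# Example: 6 buyers and 4 sellers, half of each side selected.
buyers, sellers, w = assign_double_randomization(6, 4, 0.5, 0.5, seed=1)
cells = count_pair_cells(buyers, sellers, 6, 4)
print(cells)
```

Under this assumed rule only the "both" cell receives the treatment, while the three untreated cells differ in whether the buyer or the seller is exposed elsewhere. Comparing those untreated cells is one way interference might be probed, something a single-layer RCT does not allow.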

MSC:

62-XX Statistics

Citations:

Zbl 0383.62021

References:

[1] Aronow, P. M. (2012). A general method for detecting interference between units in randomized experiments. Sociol. Methods Res. 41 3-16. · MR3190698 · doi:10.1177/0049124112437535
[2] Aronow, P. M. and Samii, C. (2017). Estimating average causal effects under general interference, with application to a social network experiment. Ann. Appl. Stat. 11 1912-1947. · MR3743283 · Zbl 1383.62329 · doi:10.1214/16-AOAS1005
[3] Athey, S., Eckles, D. and Imbens, G. W. (2018). Exact \(p\)-values for network interference. J. Amer. Statist. Assoc. 113 230-240. · MR3803460 · Zbl 1398.62140 · doi:10.1080/01621459.2016.1241178
[4] Athey, S. and Imbens, G. W. (2022). Design-based analysis in difference-in-differences settings with staggered adoption. J. Econometrics 226 62-79. · MR4348786 · Zbl 07471885 · doi:10.1016/j.jeconom.2020.10.012
[5] Backstrom, L. and Kleinberg, J. (2011). Network bucket testing. In Proceedings of the 20th International Conference on World Wide Web 615-624.
[6] Bajari, P., Burdick, B., Imbens, G. W., Masoero, L., McQueen, J., Richardson, T. and Rosen, I. M. (2021). Multiple randomization designs. arXiv preprint. Available at arXiv:2112.13495.
[7] Basse, G. W., Feller, A. and Toulis, P. (2019). Randomization tests of causal effects under interference. Biometrika 106 487-494. · MR3949317 · Zbl 1434.62094 · doi:10.1093/biomet/asy072
[8] Bojinov, I., Simchi-Levi, D. and Zhao, J. (2020). Design and analysis of switchback experiments. Available at SSRN 3684168.
[9] Bond, R. M., Fariss, C. J., Jones, J. J., Kramer, A. D., Marlow, C., Settle, J. E. and Fowler, J. H. (2012). A 61-million-person experiment in social influence and political mobilization. Nature 489 295-298.
[10] Brandt, A. (1938). Tests of significance in reversal or switchback trials. Iowa Agric. Home Econ. Exp. Stat. Res. Bull. 21 1.
[11] Brown, B. W., Jr. (1980). The crossover experiment for clinical trials. Biometrics 69-79. · Zbl 0426.62076
[12] Cochran, W. (1939). Long-term agricultural experiments. Suppl. J. R. Stat. Soc. 6 104-148.
[13] Cochran, W. G. (1977). Sampling Techniques, 3rd ed. Wiley Series in Probability and Mathematical Statistics. Wiley, New York. · MR0474575 · Zbl 0353.62011
[14] Cochran, W. G. and Cox, G. M. (1948). Experimental Designs. Wiley, New York, NY.
[15] Cook, T. D. and DeMets, D. L. (2007). Introduction to Statistical Methods for Clinical Trials. Chapman & Hall/CRC, Boca Raton, FL.
[16] Crépon, B., Duflo, E., Gurgand, M., Rathelot, R. and Zamora, P. (2013). Do labor market policies have displacement effects? Evidence from a clustered randomized experiment. Q. J. Econ. 128 531-580.
[17] Fisher, R. A. (1937). The Design of Experiments. Oliver & Boyd, Edinburgh. · JFM 61.0566.03
[18] Gart, J. J. (1963). A median test with sequential application. Biometrika 50 55-62. · MR0156424 · Zbl 0114.10303 · doi:10.1093/biomet/50.1-2.55
[19] Gastwirth, J. L. (1968). The first-median test: A two-sided version of the control median test. J. Amer. Statist. Assoc. 63 692-706. · MR0240933 · Zbl 0162.21805
[20] Gupta, S., Kohavi, R., Tang, D., Xu, Y., Andersen, R., Bakshy, E., Cardin, N., Chandran, S., Chen, N. et al. (2019). Top challenges from the first practical online controlled experiments summit. ACM SIGKDD Explor. Newsl. 21 20-35.
[21] Halloran, M. E. and Struchiner, C. J. (1991). Study designs for dependent happenings. Epidemiology 2 331-338.
[22] Heckman, J. J., Lochner, L. and Taber, C. (1998). General-equilibrium treatment effects: A study of tuition policy. Amer. Econ. Rev. 88 381-386.
[23] Hemming, K., Haines, T. P., Chilton, P. J., Girling, A. J. and Lilford, R. J. (2015). The stepped wedge cluster randomised trial: Rationale, design, analysis, and reporting. BMJ 350.
[24] Holland, P. W. (1986). Statistics and causal inference. J. Amer. Statist. Assoc. 81 945-970. · MR0867618 · Zbl 0607.62001
[25] Holtz, D., Lobel, R., Liskovich, I. and Aral, S. (2020). Reducing interference bias in online marketplace pricing experiments. arXiv preprint. Available at arXiv:2004.12489.
[26] Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47 663-685. · MR0053460 · Zbl 0047.38301
[27] Hudgens, M. G. and Halloran, M. E. (2008). Toward causal inference with interference. J. Amer. Statist. Assoc. 103 832-842. · MR2435472 · Zbl 1471.62507 · doi:10.1198/016214508000000292
[28] Imai, K., Jiang, Z. and Malani, A. (2021). Causal inference with interference and noncompliance in two-stage randomized experiments. J. Amer. Statist. Assoc. 116 632-644. · MR4270009 · Zbl 1464.62208 · doi:10.1080/01621459.2020.1775612
[29] Imbens, G. W. and Rubin, D. B. (2015). Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge Univ. Press, Cambridge. · Zbl 1355.62002
[30] Johari, R., Li, H. and Weintraub, G. (2020). Experimental design in two-sided platforms: An analysis of bias. arXiv preprint. Available at arXiv:2002.05670.
[31] Jones, B. and Nachtsheim, C. J. (2009). Split-plot designs: What, why, and how. J. Qual. Technol. 41 340-361.
[32] Kohavi, R., Crook, T., Longbotham, R., Frasca, B., Henne, R., Ferres, J. L. and Melamed, T. (2009). Online experimentation at Microsoft. Data Mining Case Stud. 11.
[33] Manski, C. F. (1993). Identification of endogenous social effects: The reflection problem. Rev. Econ. Stud. 60 531-542. · MR1236836 · Zbl 0800.90377 · doi:10.2307/2298123
[34] Mathisen, H. C. (1943). A method of testing the hypothesis that two samples are from the same population. Ann. Math. Stat. 14 188-194. · MR0009285 · Zbl 0060.30402 · doi:10.1214/aoms/1177731460
[35] Munro, E., Wager, S. and Xu, K. (2021). Treatment effects in market equilibrium. arXiv preprint. Available at arXiv:2109.11647.
[36] Neyman, J. (1923/1990). On the application of probability theory to agricultural experiments. Essay on principles. Section 9. Statist. Sci. 5 465-472. · Zbl 0955.01560
[37] Ogburn, E. L. and VanderWeele, T. J. (2014). Causal diagrams for interference. Statist. Sci. 29 559-578. · MR3300359 · Zbl 1331.62200 · doi:10.1214/14-STS501
[38] Papadogeorgou, G., Mealli, F. and Zigler, C. M. (2019). Causal inference with interfering units for cluster and population level treatment allocation programs. Biometrics 75 778-787. · MR4012083 · Zbl 1436.62676 · doi:10.1111/biom.13049
[39] Pollmann, M. (2020). Causal inference for spatial treatments. arXiv preprint. Available at arXiv:2011.00373.
[40] Pouget-Abadie, J., Aydin, K., Schudy, W., Brodersen, K. and Mirrokni, V. (2019). Variance reduction in bipartite experiments through correlation clustering. Adv. Neural Inf. Process. Syst. 32.
[41] Rosenbaum, P. R. (2007). Interference between units in randomized experiments. J. Amer. Statist. Assoc. 102 191-200. · MR2345537 · Zbl 1284.62494 · doi:10.1198/016214506000001112
[42] Rubin, D. B. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34-58. · MR0472152 · Zbl 0383.62021
[43] Ugander, J., Karrer, B., Backstrom, L. and Kleinberg, J. (2013). Graph cluster randomization: Network exposure to multiple universes. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’13 329-337. Association for Computing Machinery, New York, NY, USA.
[44] VanderWeele, T. J., Tchetgen, E. J. T. and Halloran, M. E. (2014). Interference and sensitivity analysis. Statist. Sci. 29 687-706. · MR3300366 · Zbl 1331.62443 · doi:10.1214/14-STS479
[45] Wager, S. and Xu, K. (2021). Experimenting in equilibrium. Manage. Sci. https://doi.org/10.1287/mnsc.2020.3844.
[46] Wu, C. J. and Hamada, M. S. (2011). Experiments: Planning, Analysis, and Optimization 552. Wiley, New York.
[47] Xiong, R., Athey, S., Bayati, M. and Imbens, G. W. (2019). Optimal experimental design for staggered rollouts. http://dx.doi.org/10.2139/ssrn.3483934.
[48] Yates, F. (1935). Complex experiments. Suppl. J. R. Stat. Soc. 2 181-247.
[49] Zigler, C. M. and Papadogeorgou, G. (2021). Bipartite causal inference with interference. Statist. Sci. 36 109-123. · MR4194206 · Zbl 07368222 · doi:10.1214/19-STS749
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.