
Testing the equality of a large number of populations. (English) Zbl 1484.62051

Summary: Given \(k\) independent samples of finite but arbitrary dimension, this paper deals with the problem of testing for the equality of their distributions, which can be continuous, discrete or mixed. In contrast to the classical setting, where \(k\) is assumed to be fixed and the sample size from each population increases without bound, here \(k\) is assumed to be large and the size of each sample is either bounded or small in comparison with \(k\). The asymptotic distributions of two test statistics are stated under the null hypothesis of the equality of the \(k\) distributions as well as under alternatives, which allows us to study the asymptotic power of the resulting tests. Specifically, it is shown that both test statistics are asymptotically distribution-free under the null hypothesis. The finite sample performance of the tests based on the asymptotic null distribution is studied via simulation. An application of the proposal to a real data set is included. The use of the proposed procedure for infinite dimensional data, as well as other possible extensions, is discussed.

MSC:

62G10 Nonparametric hypothesis testing
62G20 Asymptotic properties of nonparametric inference
