Testing equality of a large number of densities under mixing conditions

  • Original Paper
  • Published in TEST

Abstract

In certain settings, such as microarray data, the sampling information is formed by a large number of possibly dependent small data sets. In some applications, for example in order to perform clustering, the researcher wants to verify whether all data sets share a common distribution. We therefore propose a formal test of the null hypothesis that all data sets come from a single distribution. In the asymptotic setting considered, the number of small data sets goes to infinity while the sample size remains fixed. The asymptotic null distribution of the proposed test is derived under mixing conditions on the sequence of small data sets, and the power properties of the test under two reasonable fixed alternatives are investigated. A simulation study shows that the test respects the nominal level and that its power tends to 1 as the number of data sets tends to infinity. An illustration involving microarray data is provided.
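
To make the data setting concrete, the following is a minimal sketch of what "a large number of possibly dependent small data sets" under the null hypothesis might look like. It is not the simulation design of the paper: the moving-average construction, the parameter `theta`, and the function names `simulate_null_data` and `lag_correlation` are illustrative assumptions. Each of the `k` data sets has `n` standard normal observations, and consecutive data sets are correlated while data sets two or more positions apart are independent (1-dependence, a simple special case of mixing).

```python
import random
from statistics import fmean


def simulate_null_data(k, n, theta=0.5, seed=0):
    """Return k data sets of size n, all N(0, 1), 1-dependent across data sets.

    Hypothetical construction: data set i mixes latent noise rows i and i - 1,
    so neighbouring data sets are correlated but share the same distribution.
    """
    rng = random.Random(seed)
    # One extra latent row so that the first data set also has a predecessor.
    noise = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k + 1)]
    scale = (1.0 + theta ** 2) ** 0.5  # keeps the marginal variance equal to 1
    return [
        [(noise[i + 1][j] + theta * noise[i][j]) / scale for j in range(n)]
        for i in range(k)
    ]


def lag_correlation(data, lag):
    """Empirical correlation between matching observations of data sets i and i + lag."""
    pairs = [
        (data[i][j], data[i + lag][j])
        for i in range(len(data) - lag)
        for j in range(len(data[0]))
    ]
    xs, ys = zip(*pairs)
    mx, my = fmean(xs), fmean(ys)
    cov = fmean((x - mx) * (y - my) for x, y in pairs)
    var_x = fmean((x - mx) ** 2 for x in xs)
    var_y = fmean((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Many data sets (k large), few observations in each (n small and fixed).
data = simulate_null_data(k=2000, n=4)
print(len(data), len(data[0]))
print(round(lag_correlation(data, 1), 3), round(lag_correlation(data, 2), 3))
```

In this sketch the lag-1 correlation between data sets is `theta / (1 + theta ** 2)` in theory and the lag-2 correlation is zero, so the null hypothesis of a common distribution holds even though the data sets are not independent.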

Acknowledgements

This work received financial support from the Call 2015 Grants for Ph.D. contracts for training of doctors of the Ministry of Economy and Competitiveness, co-financed by the European Social Fund (Ref. BES-2015-074958). We acknowledge support from the MTM2014-55966-P project, Ministry of Economy and Competitiveness, and the MTM2017-89422-P project, Ministry of Economy, Industry and Competitiveness, State Research Agency, and Regional Development Fund, EU. We also acknowledge the financial support provided by the SiDOR research group through the grant Competitive Reference Group, 2016–2019 (ED431C 2016/040), funded by the “Consellería de Cultura, Educación e Ordenación Universitaria. Xunta de Galicia.” Finally, the first author thanks the University of Vigo and its Escola Internacional de Doutoramento (EIDO) for the financial support provided through mobility doctorate grants. The authors also thank Professors Raymond J. Carroll and Robert Chapkin for allowing the use of their data.

Author information

Corresponding author

Correspondence to Marta Cousido-Rocha.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

The Supplementary Material includes formal definitions of mixing dependence, stationarity, and the regularity conditions needed for the technical results; a remark about Theorem 5; the proof of Theorem 6; an additional real data analysis; and additional simulation results (PDF, 394 KB).

About this article

Cite this article

Cousido-Rocha, M., de Uña-Álvarez, J. & Hart, J.D. Testing equality of a large number of densities under mixing conditions. TEST 28, 1203–1228 (2019). https://doi.org/10.1007/s11749-018-00625-3

  • DOI: https://doi.org/10.1007/s11749-018-00625-3
