×

Approximating the distributions of runs and patterns. (English) Zbl 1329.62064

Summary: The distribution theory of runs and patterns has been successfully used in a variety of applications including, for example, nonparametric hypothesis testing, reliability theory, quality control, DNA sequence analysis, general applied probability and computer science. The exact distributions of the number of runs and patterns are often very hard to obtain or computationally problematic, especially when the pattern is complex and \(n\) is very large. Normal, Poisson and compound Poisson approximations are frequently used to approximate these distributions. In this manuscript, we (i) study the asymptotic relative error of the normal, Poisson, compound Poisson and finite Markov chain imbedding and large deviation approximations; and (ii) provide some numerical studies to comparing these approximations with the exact probabilities for moderately sized \(n\). Both theoretical and numerical results show that, in the relative sense, the finite Markov chain imbedding approximation performs the best in the left tail and the large deviation approximation performs best in the right tail.

MSC:

62E10 Characterization and structure theory of statistical distributions
60E05 Probability distributions: general theory
60J10 Markov chains (discrete-time Markov processes on discrete state spaces)

References:

[1] Arratia R, Goldstein L, Gordon L: Poisson approximation and the Chen-Stein method.Stat. Sci 1990,5(4):403-434. · Zbl 0955.62542
[2] Asmussen S: Applied Probability and Queues. Springer, New York; 2003. · Zbl 1029.60001
[3] Balakrishnan N, Koutras MV: Runs and Scans with Applications. Wiley Series in Probability and Statistics. Wiley-Interscience [John Wiley & Sons], New York; 2002. · Zbl 0991.62087
[4] Barbour AD, Eagleson GK: Poisson approximation for some statistics based on exchangeable trials.Adv. Appl. Probab 1983,15(3):585-600. · Zbl 0511.60025 · doi:10.2307/1426620
[5] Barbour AD, Eagleson GK: Poisson convergence for dissociated statistics.J. Roy. Statist. Soc. Ser. B 1984,46(3):397-402. · Zbl 0567.62062
[6] Barbour AD, Eagleson GK: An improved Poisson limit theorem for sums of dissociated random variables.J. Appl. Probab 1987,24(3):586-599. · Zbl 0629.60026 · doi:10.2307/3214091
[7] Barbour AD, Hall P: On the rate of Poisson convergence.Math. Proc. Cambridge Philos. Soc 1984,95(3):473-480. · Zbl 0544.60029 · doi:10.1017/S0305004100061806
[8] Barbour AD, Chryssaphinou O: Compound Poisson approximation: a user’s guide.Ann. Appl. Probab 2001,11(3):964-1002. · Zbl 1018.60051 · doi:10.1214/aoap/1015345355
[9] Barbour AD, Holst L, Janson S: Poisson Approximation. Oxford Studies in Probability. 199a.
[10] Barbour AD, Chen LHY, Loh W-L: Compound Poisson approximation for nonnegative random variables via Stein’s method.Ann. Probab 1992b,20(4):1843-1866. · Zbl 0765.60015 · doi:10.1214/aop/1176989531
[11] Barbour AD, Chryssaphinou O, Roos M: Compound Poisson approximation in reliability theory.IEEE T. Reliab 1995,44(3):398-402. · doi:10.1109/24.406572
[12] Barbour AD, Chryssaphinou O, Roos M: Compound Poisson approximation in systems reliability.Naval Res. Logist 1996,43(2):251-264. · Zbl 0871.90034 · doi:10.1002/(SICI)1520-6750(199603)43:2<251::AID-NAV6>3.0.CO;2-9
[13] Blom G, Thorburn D: How many random digits are required until given sequences are obtained?J. Appl. Probab 1982,19(3):518-531. · Zbl 0493.60071 · doi:10.2307/3213511
[14] Chen LHY: Poisson approximation for dependent trials.Ann. Probab 1975,3(3):534-545. · Zbl 0335.60016 · doi:10.1214/aop/1176996359
[15] Cui L, Xu Y, Zhao X: Developments and applications of the finite Markov chain imbedding approach in reliability.IEEE T. Reliab 2010,59(4):685-690. · doi:10.1109/TR.2010.2054172
[16] Fu JC: Reliability of a large consecutive-k-out-of-n:F system.IEEE T. Reliab 1985, R-34:120-127. · Zbl 0564.62083 · doi:10.1109/TR.1985.5221970
[17] Fu JC, Johnson BC: Approximate probabilities for runs and patterns in i.i.d. and Markov dependent multi-state trials.Adv. Appl. Probab 2009,41(1):292-308. · Zbl 1166.60008 · doi:10.1239/aap/1240319586
[18] Fu JC, Koutras MV: Distribution theory of runs: a Markov chain approach.J. Amer. Statist. Assoc 1994,89(427):1050-1058. · Zbl 0806.60011 · doi:10.1080/01621459.1994.10476841
[19] Fu JC, Lou WYW: Distribution Theory of Runs and Patterns and Its Applications. World Scientific Publishing Co. Inc, River Edge; 2003. · Zbl 1030.60063 · doi:10.1142/4669
[20] Fu JC, Lou WYW: On the normal approximation for the distribution of the number of simple or compound patterns in a random sequence of multi-state trials.Methodol. Comput. Appl. Probab 2007,9(2):195-205. · Zbl 1122.60023 · doi:10.1007/s11009-007-9019-5
[21] Fu JC, Johnson BC, Chang Y-M: Approximating the extreme right-hand tail probability for the distribution of the number of patterns in a sequence of multi-state trials.J. Stat. Plan. Infer 2012,142(2):473-480. · Zbl 1428.60030 · doi:10.1016/j.jspi.2011.08.005
[22] Gerber HU, Li S-YR: The occurrence of sequence patterns in repeated experiments and hitting times in a Markov chain.Stochastic Process. Appl 1981,11(1):101-108. · Zbl 0449.60050 · doi:10.1016/0304-4149(81)90025-9
[23] Godbole AP: Degenerate and Poisson convergence criteria for success runs.Statist. Probab. Lett 1990a,10(3):247-255. · Zbl 0717.60035 · doi:10.1016/0167-7152(90)90082-I
[24] Godbole AP: Specific formulae for some success run distributions.Statist. Probab. Lett 1990b,10(2):119-124. · Zbl 0706.60008 · doi:10.1016/0167-7152(90)90006-S
[25] Godbole AP: Poisson approximations for runs and patterns of rare events.Adv. Appl. Probab 1991,23(4):851-865. · Zbl 0751.60018 · doi:10.2307/1427680
[26] Godbole AP, Schaffner AA: Improved Poisson approximations for word patterns.Adv. Appl. Probab 1993,25(2):334-347. · Zbl 0772.60013 · doi:10.2307/1427656
[27] Holst L, Kennedy JE, Quine MP: Rates of Poisson convergence for some coverage and urn problems using coupling.J. Appl. Probab 1988,25(4):717-724. · Zbl 0667.60013 · doi:10.2307/3214292
[28] Karlin S: Statistical signals in bioinformatics.Proc. Natl. Acad. Sci. U. S. A 2005,102(38):13355-13362. · doi:10.1073/pnas.0501804102
[29] Karlin S, Taylor HM: A First Course in Stochastic Processes. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York-London; 1975. · Zbl 0315.60016
[30] Kleffe J, Borodovski M: First and second moment of counts of words in random text generated by Markov chains.Comp Applic Biosci 1992, 8:443-441.
[31] Martin, J.; Regad, L.; Camproux, A-C; Nuel, G.; Skiadas, CH (ed.), Finite Markov chain embedding for the exact distribution of patterns in a set of random sequences (2010), Boston
[32] Meyn SP, Tweedie RL: Markov Chains and Stochastic Stability. Communications and Control Engineering Series. 1993. · Zbl 0925.60001 · doi:10.1007/978-1-4471-3267-7
[33] Nuel G, Regad L, Martin J, Camproux A-C: Exact distribution of a pattern in a set of random sequences generated by a Markov source: applications to biological data.Algorithm Mol. Biol 2010,5(1):1-18. · doi:10.1186/1748-7188-5-15
[34] Schwager SJ: Run probabilities in sequences of Markov-dependent trials.J. Amer. Statist. Assoc 1983,78(381):168-180. · Zbl 0509.60069 · doi:10.1080/01621459.1983.10477947
[35] Seneta E: Non-negative Matrices and Markov Chains. Springer, New York; 1983.
[36] Solov’ev AD: A combinatorial identity and its application to the problem on the first occurrence of a rare event.Teor. Verojatnost. i Primenen 1966, 11:313-320. · Zbl 0154.43104
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.