Accelerating MCMC algorithms. (English) Zbl 07910825


MSC:

62-08 Computational methods for problems pertaining to statistics

Software:

Stan

References:

[1] Andrieu, C., & Doucet, A. (2002). Particle filtering for partially observed Gaussian state space models. Journal of the Royal Statistical Society: Series B, 64, 827-836. · Zbl 1067.62098
[2] Andrieu, C., Doucet, A., & Holenstein, R. (2010). Particle Markov chain Monte Carlo (with discussion). Journal of the Royal Statistical Society: Series B, 72(3), 269-342. · Zbl 1411.65020
[3] Andrieu, C., & Roberts, G. (2009). The pseudo‐marginal approach for efficient Monte Carlo computations. The Annals of Statistics, 37, 697-725. · Zbl 1185.60083
[4] Angelino, E., Kohler, E., Waterland, A., Seltzer, M., & Adams, R. (2014). Accelerating MCMC via parallel predictive prefetching. arXiv preprint arXiv:1403.7265.
[5] Aslett, L., Esperança, P., & Holmes, C. (2015). A review of homomorphic encryption and software tools for encrypted statistical machine learning. arXiv preprint arXiv:1508.06574.
[6] Atchadé, Y. F., & Liu, J. S. (2010). The Wang‐Landau algorithm for Monte Carlo computation in general state spaces. Statistica Sinica, 20, 209-233. · Zbl 1181.62022
[7] Atchadé, Y. F., Roberts, G., & Rosenthal, J. (2011). Towards optimal scaling of Metropolis‐coupled Markov chain Monte Carlo. Statistics and Computing, 21, 555-568. · Zbl 1223.65001
[8] Banterle, M., Grazian, C., Lee, A., & Robert, C. P. (2015). Accelerating Metropolis-Hastings algorithms by delayed acceptance. arXiv preprint arXiv:1503.00996.
[9] Bardenet, R., Doucet, A., & Holmes, C. (2014). Towards scaling up Markov chain Monte Carlo: An adaptive subsampling approach. Paper presented at International Conference on Machine Learning (ICML), 405-413.
[10] Bardenet, R., Doucet, A., & Holmes, C. (2015). On Markov chain Monte Carlo methods for tall data. arXiv preprint arXiv:1505.02827.
[11] Bédard, M., Douc, R., & Moulines, E. (2012). Scaling analysis of multiple‐try MCMC methods. Stochastic Processes and their Applications, 122, 758-786. · Zbl 1239.60075
[12] Betancourt, M. (2017). A conceptual introduction to Hamiltonian Monte Carlo. ArXiv e‐prints: 1701.02434.
[13] Bhatnagar, N., & Randall, D. (2016). Simulated tempering and swapping on mean‐field models. Journal of Statistical Physics, 164, 495-530. · Zbl 1348.82042
[14] Bierkens, J. (2016). Non‐reversible Metropolis‐Hastings. Statistics and Computing, 26, 1213-1228. · Zbl 1360.65040
[15] Bierkens, J., Bouchard‐Côté, A., Doucet, A., Duncan, A. B., Fearnhead, P., Roberts, G., & Vollmer, S. J. (2017). Piecewise deterministic Markov processes for scalable Monte Carlo on restricted domains. arXiv preprint arXiv:1701.04244.
[16] Bierkens, J., Fearnhead, P., & Roberts, G. (2016). The zig‐zag process and super‐efficient sampling for Bayesian analysis of big data. arXiv preprint arXiv:1607.03188.
[17] Bou‐Rabee, N., & Sanz‐Serna, J. M. (2017). Randomized Hamiltonian Monte Carlo. The Annals of Applied Probability, 27, 2159-2194. · Zbl 1373.60129
[18] Bouchard‐Côté, A., Vollmer, S. J., & Doucet, A. (2017). The bouncy particle sampler: A non‐reversible rejection‐free Markov chain Monte Carlo method. Journal of the American Statistical Association, To appear.
[19] Calderhead, B. (2014). A general construction for parallelizing Metropolis-Hastings algorithms. Proceedings of the National Academy of Sciences, 111, 17408-17413.
[20] Cappé, O., Douc, R., Guillin, A., Marin, J.‐M., & Robert, C. (2008). Adaptive importance sampling in general mixture classes. Statistics and Computing, 18, 447-459.
[21] Cappé, O., Guillin, A., Marin, J.‐M., & Robert, C. (2004). Population Monte Carlo. Journal of Computational and Graphical Statistics, 13, 907-929.
[22] Cappé, O., & Robert, C. (2000). Ten years and still running! Journal of the American Statistical Association, 95, 1282-1286. · Zbl 1072.60506
[23] Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., … Riddell, A. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76(1).
[24] Carter, J., & White, D. (2013). History matching on the Imperial College fault model using parallel tempering. Computational Geosciences, 17, 43-65.
[25] Casella, G., & Robert, C. (1996). Rao‐Blackwellization of sampling schemes. Biometrika, 83, 81-94. · Zbl 0866.62024
[26] Chen, T., Fox, E., & Guestrin, C. (2014). Stochastic gradient Hamiltonian Monte Carlo. In Proceedings of the International Conference on Machine Learning, ICML’2014 (pp. 1683-1691).
[27] Chen, T., & Hwang, C. (2013). Accelerating reversible Markov chains. Statistics & Probability Letters, 83, 1956-1962. · Zbl 1285.60076
[28] Davis, M. H. (1984). Piecewise‐deterministic Markov processes: A general class of non‐diffusion stochastic models. Journal of the Royal Statistical Society: Series B: Methodological, 353-388. · Zbl 0565.60070
[29] Davis, M. H. (1993). Markov models & optimization (Vol. 49). CRC Press. · Zbl 0780.60002
[30] Del Moral, P., Doucet, A., & Jasra, A. (2006). Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B, 68, 411-436. · Zbl 1105.62034
[31] Deligiannidis, G., Doucet, A., & Pitt, M. K. (2015). The correlated pseudo‐marginal method. arXiv preprint arXiv:1511.04992.
[32] Ding, N., Fang, Y., Babbush, R., Chen, C., Skeel, R. D., & Neven, H. (2014). Bayesian sampling using stochastic gradient thermostats. In Proceedings of the 27th International Conference on Neural Information Processing Systems ‐ Volume 2, NIPS'14 (pp. 3203-3211).
[33] Douc, R., Guillin, A., Marin, J.‐M., & Robert, C. (2007). Convergence of adaptive mixtures of importance sampling schemes. Annals of Statistics, 35(1), 420-448. · Zbl 1132.60022
[34] Douc, R., & Robert, C. (2011). A vanilla Rao‐Blackwellization of Metropolis-Hastings algorithms. Annals of Statistics, 39(1), 261-277.
[35] Doucet, A., Godsill, S., & Andrieu, C. (2000). On sequential Monte‐Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197-208.
[36] Duane, S., Kennedy, A. D., Pendleton, B. J., & Roweth, D. (1987). Hybrid Monte Carlo. Physics Letters B, 195, 216-222.
[37] Durmus, A., & Moulines, E. (2017). Nonasymptotic convergence analysis for the unadjusted Langevin algorithm. Annals of Applied Probability, 27, 1551-1587. · Zbl 1377.65007
[38] Earl, D. J., & Deem, M. W. (2005). Parallel tempering: Theory, applications, and new perspectives. Physical Chemistry Chemical Physics, 7, 3910-3916.
[39] Fielding, M., Nott, D. J., & Liong, S.‐Y. (2011). Efficient MCMC schemes for computationally expensive posterior distributions. Technometrics, 53, 16-28.
[40] Gelfand, A., & Smith, A. (1990). Sampling based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398-409. · Zbl 0702.62020
[41] Gelman, A., Gilks, W., & Roberts, G. (1996). Efficient Metropolis jumping rules. In J. Berger, J. Bernardo, A. Dawid, D. Lindley, & A. Smith (Eds.), Bayesian statistics 5 (pp. 599-608). Oxford, England: Oxford University Press.
[42] Geyer, C. J. (1991). Markov chain Monte Carlo maximum likelihood. Computing Science and Statistics, 23, 156-163.
[43] Girolami, M., & Calderhead, B. (2011). Riemann manifold Langevin and Hamiltonian Monte Carlo methods. Journal of the Royal Statistical Society, Series B: Statistical Methodology, 73, 123-214. · Zbl 1411.62071
[44] Guihenneuc‐Jouyaux, C., & Robert, C. P. (1998). Discretization of continuous Markov chains and Markov chain Monte Carlo convergence assessment. Journal of the American Statistical Association, 93, 1055-1067. · Zbl 1013.60055
[45] Haario, H., Saksman, E., & Tamminen, J. (1999). Adaptive proposal distribution for random walk Metropolis algorithm. Computational Statistics, 14(3), 375-395. · Zbl 0941.62036
[46] Hoffman, M. D., & Gelman, A. (2014). The No‐U‐turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15, 1593-1623. · Zbl 1319.60150
[47] Hwang, C.‐R., Hwang‐Ma, S.‐Y., & Sheu, S.‐J. (1993). Accelerating Gaussian diffusions. The Annals of Applied Probability, 3, 897-913. · Zbl 0780.60074
[48] Iba, Y. (2000). Population‐based Monte Carlo algorithms. Transactions of the Japanese Society for Artificial Intelligence, 16, 279-286.
[49] Jacob, P. E., O’Leary, J., & Atchadé, Y. F. (2017). Unbiased Markov chain Monte Carlo with couplings. ArXiv e‐prints. 1708.03625.
[50] Jacob, P., Robert, C. P., & Smith, M. H. (2011). Using parallel computation to improve independent Metropolis-Hastings based estimation. Journal of Computational and Graphical Statistics, 20, 616-635.
[51] Lehmann, E., & Casella, G. (1998). Theory of point estimation (revised ed.). New York, NY: Springer‐Verlag. · Zbl 0916.62017
[52] Liang, F., Liu, C., & Carroll, R. (2007). Stochastic approximation in Monte Carlo computation. Journal of the American Statistical Association, 102, 305-320. · Zbl 1226.65002
[53] Liu, J., Wong, W., & Kong, A. (1994). Covariance structure of the Gibbs sampler with application to the comparison of estimators and augmentation schemes. Biometrika, 81, 27-40. · Zbl 0811.62080
[54] Liu, J., Wong, W., & Kong, A. (1995). Covariance structure and convergence rates of the Gibbs sampler with various scans. Journal of the Royal Statistical Society: Series B, 57, 157-169. · Zbl 0811.60056
[55] Liu, J. S., Liang, F., & Wong, W. H. (2000). The multiple‐try method and local optimization in Metropolis sampling. Journal of the American Statistical Association, 95, 121-134. · Zbl 1072.65505
[56] Livingstone, S., Faulkner, M. F., & Roberts, G. O. (2017). Kinetic energy choice in Hamiltonian/hybrid Monte Carlo. arXiv preprint arXiv:1706.02649. · Zbl 1439.60070
[57] MacKay, D. J. C. (2002). Information theory, inference & learning algorithms. Cambridge, England: Cambridge University Press.
[58] Marinari, E., & Parisi, G. (1992). Simulated tempering: A new Monte Carlo scheme. EPL (Europhysics Letters), 19, 451-458.
[59] Martino, L., Elvira, V., Luengo, D., Corander, J., & Louzada, F. (2016). Orthogonal parallel MCMC methods for sampling and optimization. Digital Signal Processing, 58, 64-84.
[60] Martino, L. (2018). A Review of Multiple Try MCMC algorithms for Signal Processing. ArXiv e‐prints. 1801.09065.
[61] Mengersen, K., & Robert, C. (2003). Iid sampling with self‐avoiding particle filters: The pinball sampler. In J. Bernardo, M. Bayarri, J. Berger, A. Dawid, D. Heckerman, A. Smith, & M. West (Eds.), Bayesian statistics (Vol. 7). Oxford, England: Oxford University Press. · Zbl 1044.62002
[62] Meyn, S., & Tweedie, R. (1993). Markov chains and stochastic stability. New York, NY: Springer‐Verlag. · Zbl 0925.60001
[63] Miasojedow, B., Moulines, E., & Vihola, M. (2013). An adaptive parallel tempering algorithm. Journal of Computational and Graphical Statistics, 22, 649-664.
[64] Minsker, S., Srivastava, S., Lin, L., & Dunson, D. B. (2014). Scalable and robust Bayesian inference via the median posterior. In Proceedings of the 31st International Conference on Machine Learning ‐ Volume 32 (pp. 1656-1664). ICML'14, JMLR.org.
[65] Mira, A. (2001). On Metropolis‐Hastings algorithms with delayed rejection. Metron, 59(3-4), 231-241. · Zbl 0998.65502
[66] Mira, A., & Sargent, D. J. (2003). A new strategy for speeding Markov chain Monte Carlo algorithms. Statistical Methods and Applications, 12, 49-60. · Zbl 1064.65003
[67] Mohamed, L., Calderhead, B., Filippone, M., Christie, M., & Girolami, M. (2012). Population MCMC methods for history matching and uncertainty quantification. Computational Geosciences, 16, 423-436.
[68] Mykland, P., Tierney, L., & Yu, B. (1995). Regeneration in Markov chain samplers. Journal of the American Statistical Association, 90, 233-241. · Zbl 0819.62082
[69] Neal, R. M. (1996). Sampling from multimodal distributions using tempered transitions. Statistics and Computing, 6, 353-366.
[70] Neal, R. (1996). Bayesian learning for neural networks (Lecture Notes in Statistics, Vol. 118). New York, NY: Springer‐Verlag.
[71] Neal, R. (2011). MCMC using Hamiltonian dynamics. In S. Brooks, A. Gelman, G. L. Jones, & X.‐L. Meng (Eds.), Handbook of Markov Chain Monte Carlo (pp. 113-162). New York, NY: CRC Press. · Zbl 1229.65018
[72] Neiswanger, W., Wang, C., & Xing, E. (2013). Asymptotically exact, embarrassingly parallel MCMC. arXiv preprint arXiv:1311.4780.
[73] Quiroz, M., Villani, M., & Kohn, R. (2016). Exact subsampling MCMC. arXiv preprint arXiv:1603.08232.
[74] Rasmussen, C. E. (2003). Gaussian processes to speed up hybrid Monte Carlo for expensive Bayesian integrals. In J. Bernardo, M. Bayarri, J. Berger, A. Dawid, D. Heckerman, A. Smith, & M. West (Eds.), Bayesian statistics (Vol. 7, pp. 651-659). Oxford, England: Oxford University Press. · Zbl 1044.62002
[75] Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). Cambridge, MA: The MIT Press.
[76] Rhee, C.‐H., & Glynn, P. W. (2015). Unbiased estimation with square root convergence for SDE models. Operations Research, 63, 1026-1043. · Zbl 1347.65016
[77] Robert, C., & Casella, G. (2004). Monte Carlo statistical methods (2nd ed.). New York, NY: Springer‐Verlag. · Zbl 1096.62003
[78] Robert, C., & Casella, G. (2009). Introducing Monte Carlo methods with R. New York: Springer‐Verlag.
[79] Roberts, G., Gelman, A., & Gilks, W. R. (1997). Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability, 7, 110-120. · Zbl 0876.60015
[80] Roberts, G., & Rosenthal, J. (2001). Optimal scaling for various Metropolis‐Hastings algorithms. Statistical Science, 16, 351-367. · Zbl 1127.65305
[81] Roberts, G., & Rosenthal, J. (2007). Coupling and ergodicity of adaptive Markov chain Monte Carlo algorithms. Journal of Applied Probability, 44(2), 458-475. · Zbl 1137.62015
[82] Roberts, G., & Rosenthal, J. (2009). Examples of adaptive MCMC. Journal of Computational and Graphical Statistics, 18, 349-367.
[83] Roberts, G., & Rosenthal, J. (2014). Minimising MCMC variance via diffusion limits, with an application to simulated tempering. The Annals of Applied Probability, 24, 131-149. · Zbl 1298.60078
[84] Rubinstein, R. Y. (1981). Simulation and the Monte Carlo method. New York, NY: John Wiley. · Zbl 0529.68076
[85] Saksman, E., & Vihola, M. (2010). On the ergodicity of the adaptive Metropolis algorithm on unbounded domains. The Annals of Applied Probability, 20(6), 2178-2203. · Zbl 1209.65004
[86] Scott, S. L., Blocker, A. W., Bonassi, F. V., Chipman, H. A., George, E. I., & McCulloch, R. E. (2016). Bayes and big data: The consensus Monte Carlo algorithm. International Journal of Management Science and Engineering Management, 11, 78-88.
[87] Srivastava, S., Cevher, V., Dinh, Q., & Dunson, D. (2015). WASP: Scalable Bayes via barycenters of subset posteriors. In G. Lebanon & S. V. N. Vishwanathan (Eds.), Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics (pp. 912-920). Vol. 38 of Proceedings of Machine Learning Research. San Diego, CA: PMLR.
[88] Storvik, G. (2002). Particle filters for state space models with the presence of static parameters. IEEE Transactions on Signal Processing, 50, 281-289.
[89] Sun, Y., Gomez, F., & Schmidhuber, J. (2010). Improving the asymptotic performance of Markov chain Monte Carlo by inserting vortices. In Proceedings of the 23rd International Conference on Neural Information Processing Systems ‐ Volume 2 (pp. 2235-2243). NIPS’10, Curran Associates Inc., USA.
[90] Terenin, A., Simpson, D., & Draper, D. (2015). Asynchronous Gibbs Sampling. ArXiv e‐prints. 1509.08999.
[91] Tierney, L., & Mira, A. (1999). Some adaptive Monte Carlo methods for Bayesian inference. Statistics in Medicine, 18, 2507-2515.
[92] Tjelmeland, H. (2004). Using all Metropolis‐Hastings proposals to estimate mean values. (Technical Report 4). Norwegian University of Science and Technology, Trondheim, Norway.
[93] Wang, F., & Landau, D. (2001). Determining the density of states for classical statistical models: A random walk algorithm to produce a flat histogram. Physical Review E, 64, 056101.
[94] Wang, X., & Dunson, D. (2013). Parallelizing MCMC via Weierstrass sampler. arXiv preprint arXiv:1312.4605.
[95] Wang, X., Guo, F., Heller, K., & Dunson, D. (2015). Parallelizing MCMC with random partition trees. Advances in Neural Information Processing Systems, 451-459.
[96] Welling, M., & Teh, Y. (2011). Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on Machine Learning, ICML'11 (pp. 681-688). USA: Omnipress.
[97] Woodard, D. B., Schmidler, S. C., & Huber, M. (2009a). Conditions for rapid mixing of parallel and simulated tempering on multimodal distributions. The Annals of Applied Probability, 19, 617-640. · Zbl 1171.65008
[98] Woodard, D. B., Schmidler, S. C., & Huber, M. (2009b). Sufficient conditions for torpid mixing of parallel and simulated tempering. Electronic Journal of Probability, 14, 780-804. · Zbl 1189.65021
[99] Xie, Y., Zhou, J., & Jiang, S. (2010). Parallel tempering Monte Carlo simulations of lysozyme orientation on charged surfaces. The Journal of Chemical Physics, 132, 02B602.