Document Zbl 07904039

Convergence of simulated annealing using kinetic Langevin dynamics. (English) Zbl 07904039

Electron. J. Probab. 29, Paper No. 77, 31 p. (2024).

Summary: We study the simulated annealing algorithm based on the kinetic Langevin dynamics, in order to find the global minimum of a non-convex potential function. For both the continuous time formulation and a discrete time analogue, we obtain the convergence rate results under technical conditions on the potential function, together with an appropriate choice of the cooling schedule and the time discretization parameters.

MSC:

35Q84	Fokker-Planck equations
35Q82	PDEs in connection with statistical mechanics
82C31	Stochastic methods (Fokker-Planck, Langevin, etc.) applied to problems in time-dependent statistical mechanics
65K10	Numerical optimization and variational techniques
90C59	Approximation methods and heuristics in mathematical programming
65C30	Numerical solutions to stochastic differential and integral equations
60H15	Stochastic partial differential equations (aspects of stochastic analysis)
65M12	Stability and convergence of numerical methods for initial value and initial-boundary value problems involving PDEs
65M06	Finite difference methods for initial value and initial-boundary value problems involving PDEs
35B40	Asymptotic behavior of solutions to PDEs

Keywords:

convergence rate; discretization; hypocoercivity; kinetic Langevin dynamics; simulated annealing

Cite Review PDF

Full Text: DOI arXiv Link

OA License

References:

[1]	A. Arnold, P. Markowich, G. Toscani, and A. Unterreiter. On convex Sobolev inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. Communications in Partial Differential Equations, 26(1-2):43-100, 2001. MathSciNet: MR1842428 · Zbl 0982.35113
[2]	D. Bakry, I. Gentil, and M. Ledoux. Analysis and Geometry of Markov Diffusion Operators, volume 348 of Grundlehren der mathematischen Wissenschaften. Springer Cham, 2014. MathSciNet: MR3155209 · Zbl 1376.60002
[3]	F. Baudoin, M. Gordina, and D. P. Herzog. Gamma calculus beyond Villani and explicit convergence estimates for Langevin dynamics with singular potentials. Archive for Rational Mechanics and Analysis, 241(2):765-804, 2021. MathSciNet: MR4275746 · Zbl 1478.60215
[4]	E. Bayraktar, Q. Feng, and W. Li. Exponential entropy dissipation for weakly self-consistent Vlasov-Fokker-Planck equations. arXiv:2204.12049, 2022. MathSciNet: MR4661252
[5]	E. Camrud, A. O. Durmus, P. Monmarché, and G. Stoltz. Second order quantitative bounds for unadjusted generalized Hamiltonian Monte Carlo. arXiv:2306.09513, 2023.
[6]	E. Camrud, D. P. Herzog, G. Stoltz, and M. Gordina. Weighted \(L^2\)-contractivity of Langevin dynamics with singular potentials. Nonlinearity, 35(2):998-1035, 2021. MathSciNet: MR4373993 · Zbl 1490.35496
[7]	M. Chak, N. Kantas, and G. A. Pavliotis. On the generalised Langevin equation for simulated annealing. arXiv:2003.06448, 2020. MathSciNet: MR4555159
[8]	X. Cheng, N. S. Chatterji, P. L. Bartlett, and M. I. Jordan. Underdamped Langevin MCMC: A non-asymptotic analysis. In Conference on Learning Theory, volume 75 of Proceedings of Machine Learning Research, pages 300-323, 2018.
[9]	T.-S. Chiang, C.-R. Hwang, and S. J. Sheu. Diffusion for global optimization in \(\mathbb{R}^n\). SIAM Journal on Control and Optimization, 25(3):737-753, 1987. MathSciNet: MR0885196 · Zbl 0622.60093
[10]	L. Desvillettes and C. Villani. On the trend to global equilibrium in spatially inhomogeneous entropy-dissipating systems: The linear Fokker-Planck equation. Communications on Pure and Applied Mathematics, 54(1):1-42, 2001. MathSciNet: MR1787105 · Zbl 1029.82032
[11]	G. Dujardin, F. Hérau, and P. Lafitte. Coercivity, hypocoercivity, exponential time decay and simulations for discrete Fokker-Planck equations. Numerische Mathematik, 144(3):615-697, 2020. MathSciNet: MR4071827 · Zbl 1435.35385
[12]	A. Eberle, A. Guillin, and R. Zimmer. Couplings and quantitative contraction rates for Langevin dynamics. Annals of Probability, 47(4):1982-2010, 2019. MathSciNet: MR3980913 · Zbl 1466.60160
[13]	N. Fournier and C. Tardif. On the simulated annealing in \(\mathbb{R}^d\). Journal of Functional Analysis, 281(5):109086, 2021. MathSciNet: MR4254553 · Zbl 1481.60155
[14]	X. Gao, M. Gürbüzbalaban, and L. Zhu. Global convergence of stochastic gradient Hamiltonian Monte Carlo for nonconvex stochastic optimization: Nonasymptotic performance bounds and momentum-based acceleration. Operations Research, forthcoming, 2021. MathSciNet: MR4531091
[15]	X. Gao, Z. Q. Xu, and X. Y. Zhou. State-dependent temperature control for Langevin diffusions. SIAM Journal on Control and Optimization, 60(3):1250-1268, 2022. MathSciNet: MR4421477 · Zbl 1493.93057
[16]	S. Geman and C.-R. Hwang. Diffusions for global optimization. SIAM Journal on Control and Optimization, 24(5):1031-1043, 1986. MathSciNet: MR0854068 · Zbl 0602.60071
[17]	A. Guillin, W. Liu, L. Wu, and C. Zhang. The kinetic Fokker-Planck equation with mean field interaction. Journal de Mathématiques Pures et Appliquées, 150:1-23, 2021. MathSciNet: MR4248461 · Zbl 1480.60235
[18]	F. Hérau. Hypocoercivity and exponential time decay for the linear inhomogeneous relaxation Boltzmann equation. Asymptotic Analysis, 46(3-4):349-359, 2006. MathSciNet: MR2215889 · Zbl 1096.35019
[19]	F. Hérau. Short and long time behavior of the Fokker-Planck equation in a confining potential and applications. Journal of Functional Analysis, 244(1):95-118, 2007. MathSciNet: MR2294477 · Zbl 1120.35016
[20]	R. A. Holley, S. Kusuoka, and D. W. Stroock. Asymptotics of the spectral gap with applications to the theory of simulated annealing. Journal of Functional Analysis, 83(2):333-347, 1989. MathSciNet: MR0995752 · Zbl 0706.58075
[21]	P. Jain, P. Kar, et al. Non-convex optimization for machine learning. Foundations and Trends® in Machine Learning, 10(3-4):142-363, 2017.
[22]	L. Journel and P. Monmarché. Convergence of the kinetic annealing for general potentials. arXiv:2107.11619, 2021. MathSciNet: MR4522371
[23]	S. Kirkpatrick, C. D. Gelatt Jr, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671-680, 1983. MathSciNet: MR0702485 · Zbl 1225.90162
[24]	T. Lelièvre and G. Stoltz. Partial differential equations and stochastic methods in molecular dynamics. Acta Numerica, 25:681-880, 2016. MathSciNet: MR3509213 · Zbl 1348.82065
[25]	Y.-A. Ma, N. S. Chatterji, X. Cheng, N. Flammarion, P. L. Bartlett, and M. I. Jordan. Is there an analog of Nesterov acceleration for gradient-based MCMC? Bernoulli, 27(3):1942-1992, 2021. MathSciNet: MR4278799 · Zbl 1475.62123
[26]	J. C. Mattingly, A. M. Stuart, and D. J. Higham. Ergodicity for SDEs and approximations: Locally Lipschitz vector fields and degenerate noise. Stochastic Processes and their Applications, 101(2):185-232, 2002. MathSciNet: MR1931266 · Zbl 1075.60072
[27]	G. Menz and A. Schlichting. Poincaré and logarithmic Sobolev inequalities by decomposition of the energy landscape. Annals of Probability, 42(5):1809-1884, 2014. MathSciNet: MR3262493 · Zbl 1327.60156
[28]	G. Menz, A. Schlichting, W. Tang, and T. Wu. Ergodicity of the infinite swapping algorithm at low temperature. Stochastic Processes and their Applications, 151:519-552, 2022. MathSciNet: MR4447828 · Zbl 1494.37008
[29]	L. Miclo. Recuit simulé sur \(\mathbb{R}^n\). Étude de l’évolution de l’énergie libre. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 28(2):235-266, 1992. MathSciNet: MR1162574 · Zbl 0747.60071
[30]	P. Monmarché. Hypocoercivity in metastable settings and kinetic simulated annealing. Probability Theory and Related Fields, 172(3):1215-1248, 2018. MathSciNet: MR3877555 · Zbl 1404.60120
[31]	P. Monmarché. An entropic approach for Hamiltonian Monte Carlo: the idealized case. The Annals of Applied Probability, 34(2):2243-2293, 2024. MathSciNet: MR4728169 · Zbl 07899445
[32]	W. Mou, Y.-A. Ma, M. J. Wainwright, P. L. Bartlett, and M. I. Jordan. High-order Langevin diffusion yields an accelerated MCMC algorithm. Journal of Machine Learning Research, 22(42):1-41, 2021. MathSciNet: MR4253735 · Zbl 1539.65026
[33]	F. Otto and C. Villani. Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. Journal of Functional Analysis, 173(2):361-400, 2000. MathSciNet: MR1760620 · Zbl 0985.58019
[34]	G. A. Pavliotis. Stochastic processes and applications: diffusion processes, the Fokker-Planck and Langevin equations, volume 60 of Texts in Applied Mathematics. Springer New York, NY, 2014. MathSciNet: MR3288096 · Zbl 1318.60003
[35]	A. Porretta and E. Zuazua. Numerical hypocoercivity for the Kolmogorov equation. Mathematics of Computation, 86(303):97-119, 2017. MathSciNet: MR3557795 · Zbl 1351.65059
[36]	M. Raginsky, A. Rakhlin, and M. Telgarsky. Non-convex learning via stochastic gradient Langevin dynamics: A nonasymptotic analysis. In Conference on Learning Theory, pages 1674-1703. PMLR, 2017.
[37]	G. Royer. A remark on simulated annealing of diffusion processes. SIAM Journal on Control and Optimization, 27(6):1403-1408, 1989. MathSciNet: MR1022435 · Zbl 0689.60059
[38]	D. Talay. Stochastic Hamiltonian systems: exponential convergence to the invariant measure, and discretization by the implicit Euler scheme. Markov Processes and Related Fields, 8(2):163-198, 2002. MathSciNet: MR1924934 · Zbl 1011.60039
[39]	W. Tang and X. Y. Zhou. Simulated annealing from continuum to discretization: A convergence analysis via the Eyring-Kramers law. arXiv:2102.02339, 2021.
[40]	C. Villani. Hypocoercivity, volume 202(950). Memoirs of the American Mathematical Society, 2009. MathSciNet: MR2562709
[41]	P.-A. Zitt. Annealing diffusions in a potential function with a slow growth. Stochastic Processes and their Applications, 118(1):76-119, 2008. MathSciNet: MR2376253 · Zbl 1144.60048
[42]	D. Zou, P. Xu, and Q. Gu. Stochastic gradient Hamiltonian Monte Carlo methods with recursive variance reduction. Advances in Neural Information Processing Systems, 32, 2019.

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.