Abstract
This paper proposes an accelerated proximal point method for maximally monotone operators. The proof is computer-assisted via the performance estimation problem (PEP) approach. The proximal point method underlies various well-known convex optimization methods, such as the proximal method of multipliers and the alternating direction method of multipliers, so the proposed acceleration applies widely. Numerical experiments are presented to demonstrate the accelerating behavior.
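For context, the classical (unaccelerated) proximal point method repeatedly applies the resolvent \(J_{\lambda M} = (I + \lambda M)^{-1}\) of a maximally monotone operator \(M\). A minimal sketch for the linear operator \(M(x) = Ax\) follows; the choice of \(A\) (identity plus a skew-symmetric part, hence strongly monotone) is an illustrative assumption, not an example from the paper.

```python
import numpy as np

# Minimal sketch of the (unaccelerated) proximal point iteration
#   x_{k+1} = J_{lam*M}(x_k) = (I + lam*A)^{-1} x_k
# for a linear maximally monotone operator M(x) = A x.
# A below is an illustrative assumption: identity plus skew-symmetric part,
# so its symmetric part is I and the operator is strongly monotone.
rng = np.random.default_rng(0)
n = 20
S = rng.standard_normal((n, n))
A = np.eye(n) + (S - S.T)
lam = 1.0

x = rng.standard_normal(n)
residuals = []
for _ in range(100):
    x_next = np.linalg.solve(np.eye(n) + lam * A, x)  # resolvent step
    residuals.append(np.linalg.norm(x_next - x))      # fixed-point residual
    x = x_next
```

The fixed-point residual \(\Vert x_{k+1} - x_k\Vert\) is nonincreasing because the resolvent is (firmly) nonexpansive; the paper's contribution is to improve the worst-case decay rate of this residual.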
Notes
The convergence of the fixed-point residual does not guarantee the convergence of the sequence of the iterates \(\{{\varvec{x}} _i\}\). We leave analyzing the convergence of the sequence as future work, possibly based on the convergence analysis in [10] for Nesterov’s fast gradient method [51, 52] and FISTA [8] in convex minimization.
A convex–concave function \(\phi\) satisfies \(\phi({\varvec{u}}_*,{\varvec{v}}_N) \ge \phi({\varvec{u}}_N,{\varvec{v}}_N) + \langle {\varvec{u}}_* - {\varvec{u}}_N,\, {\varvec{q}}_{{\varvec{u}},N} \rangle\) for \({\varvec{q}}_{{\varvec{u}},N} \in \partial_{{\varvec{u}}}\phi({\varvec{u}}_N,{\varvec{v}}_N)\) and \(-\phi({\varvec{u}}_N,{\varvec{v}}_*) \ge -\phi({\varvec{u}}_N,{\varvec{v}}_N) + \langle {\varvec{v}}_* - {\varvec{v}}_N,\, -{\varvec{q}}_{{\varvec{v}},N} \rangle\) for \(-{\varvec{q}}_{{\varvec{v}},N} \in \partial_{{\varvec{v}}}(-\phi({\varvec{u}}_N,{\varvec{v}}_N))\). Adding these two inequalities yields \(\langle {\varvec{x}}_N - {\varvec{x}}_*,\, {\varvec{q}}_N \rangle \ge \phi({\varvec{u}}_N,{\varvec{v}}_*) - \phi({\varvec{u}}_*,{\varvec{v}}_N)\), where \({\varvec{x}}_N := ({\varvec{u}}_N,{\varvec{v}}_N)\) and \({\varvec{q}}_N := ({\varvec{q}}_{{\varvec{u}},N},-{\varvec{q}}_{{\varvec{v}},N})\).
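The summed inequality can be verified numerically. Below is a quick sanity check for the smooth convex–concave test function \(\phi(u,v) = \tfrac{1}{2}\Vert u\Vert^2 + u^\top K v - \tfrac{1}{2}\Vert v\Vert^2\), an illustrative choice made here (not taken from the paper), whose saddle point is \((u_*, v_*) = (0, 0)\).

```python
import numpy as np

# Numerical check of the footnote's summed inequality
#   <x_N - x_*, q_N> >= phi(u_N, v_*) - phi(u_*, v_N)
# for the illustrative smooth convex-concave function
#   phi(u, v) = 0.5*||u||^2 + u^T K v - 0.5*||v||^2,
# with gradients q_{u,N} = u + K v and q_{v,N} = K^T u - v,
# and q_N = (q_{u,N}, -q_{v,N}) as defined in the footnote.
rng = np.random.default_rng(0)
n = 5
K = rng.standard_normal((n, n))

def phi(u, v):
    return 0.5 * u @ u + u @ K @ v - 0.5 * v @ v

u_N, v_N = rng.standard_normal(n), rng.standard_normal(n)
u_star, v_star = np.zeros(n), np.zeros(n)   # saddle point of this phi

q_N = np.concatenate([u_N + K @ v_N, -(K.T @ u_N - v_N)])
x_diff = np.concatenate([u_N - u_star, v_N - v_star])

lhs = x_diff @ q_N                           # <x_N - x_*, q_N>
rhs = phi(u_N, v_star) - phi(u_star, v_N)    # saddle-point gap term
assert lhs >= rhs
```

For this particular \(\phi\) the left side equals \(\Vert u_N\Vert^2 + \Vert v_N\Vert^2\) and the right side equals half of that, so the inequality holds strictly away from the saddle point.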
We found that adaptively restarting the method when the fixed-point residual increases seems to be a good option in practice.
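That restart heuristic can be sketched as a wrapper around an inertial resolvent iteration: reset the momentum whenever the fixed-point residual grows. The FISTA-style momentum rule and the test operator below are illustrative assumptions, not the specific accelerated method of the paper.

```python
import numpy as np

def restarted_inertial_ppm(resolvent, x0, iters=200):
    """Inertial resolvent iteration with adaptive restart:
    the momentum is reset whenever the fixed-point residual increases."""
    x_prev, y = x0.copy(), x0.copy()
    t, last_res = 1.0, np.inf
    residuals = []
    for _ in range(iters):
        x = resolvent(y)
        res = np.linalg.norm(x - y)               # fixed-point residual
        residuals.append(res)
        if res > last_res:                        # residual grew: restart
            t, y = 1.0, x.copy()
        else:                                     # FISTA-style momentum step
            t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
            y = x + ((t - 1) / t_next) * (x - x_prev)
            t = t_next
        x_prev, last_res = x, res
    return x, residuals

# Demo on a linear strongly monotone operator (illustrative assumption).
rng = np.random.default_rng(1)
n = 20
S = rng.standard_normal((n, n))
A = np.eye(n) + (S - S.T)
J = lambda z: np.linalg.solve(np.eye(n) + A, z)   # resolvent of M(x) = A x
x, residuals = restarted_inertial_ppm(J, rng.standard_normal(n))
```

The restart guarantees that at most one residual-increasing step occurs before the iteration falls back to a plain resolvent step, which makes the momentum safe to use even when the extrapolation would otherwise be unstable.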
Since \(\nabla f\) is Lipschitz continuous, the operator \({\varvec{M}} _1 = -{\varvec{D}} \partial f^*(-{\varvec{D}} ^\top \cdot )\) in (53) for the problem (62) is strongly monotone, but this is insufficient to guarantee strong monotonicity of \({\varvec{M}} _{\rho ,{\varvec{M}} _1,{\varvec{M}} _2}\) (46) for the problem (62).
References
Alvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9(1–2), 3–11 (2001). https://doi.org/10.1023/A:1011253113155
Attouch, H., Cabot, A.: Convergence of a relaxed inertial forward-backward algorithm for structured monotone inclusions. Appl. Math. Optim. 80(3), 547–598 (2019). https://doi.org/10.1007/s00245-019-09584-z
Attouch, H., Cabot, A.: Convergence of a relaxed inertial proximal algorithm for maximally monotone operators. Math. Program. 184(1–2), 243–287 (2020). https://doi.org/10.1007/s10107-019-01412-0
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping. Math. Program. (2020). https://doi.org/10.1007/s10107-020-01591-1
Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim. 29(3), 2227–2256 (2019). https://doi.org/10.1137/18M1230207
Attouch, H., Peypouquet, J.: Convergence of inertial dynamics and proximal algorithms governed by maximally monotone operators. Math. Program. 174(1–2), 391–432 (2019). https://doi.org/10.1007/s10107-018-1252-x
Bauschke, H.H., Combettes, P.L.: Convex analysis and monotone operator theory in Hilbert spaces. Springer, Berlin (2011). https://doi.org/10.1007/978-1-4419-9467-7
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009). https://doi.org/10.1137/080716542
Brezis, H., Lions, P.L.: Produits infinis de resolvantes. Isr. J. Math. 29(4), 329–345 (1978). https://doi.org/10.1007/BF02761171
Chambolle, A., Dossal, C.: On the convergence of the iterates of the “Fast iterative shrinkage/thresholding algorithm”. J. Optim. Theory Appl. 166(3), 968–82 (2015). https://doi.org/10.1007/s10957-015-0746-4
Chambolle, A., Pock, T.: A first-order primal–dual algorithm for convex problems with applications to imaging. J. Math. Imaging Vis. 40(1), 120–145 (2011). https://doi.org/10.1007/s10851-010-0251-1
Chambolle, A., Pock, T.: An introduction to continuous optimization for imaging. Acta Numer. 25, 161–319 (2016). https://doi.org/10.1017/S096249291600009X
Chambolle, A., Pock, T.: On the ergodic convergence rates of a first-order primal-dual algorithm. Math. Program. 159(1), 253–87 (2016). https://doi.org/10.1007/s10107-015-0957-3
Combettes, P.L.: Monotone operator theory in convex optimization. Math. Program. 170(1), 177–206 (2018)
Corman, E., Yuan, X.: A generalized proximal point algorithm and its convergence rate. SIAM J. Optim. 24(4), 1614–38 (2014). https://doi.org/10.1137/130940402
CVX Research Inc.: CVX: Matlab software for disciplined convex programming, version 2.0. http://cvxr.com/cvx (2012)
Davis, D., Yin, W.: Convergence rate analysis of several splitting schemes. In: Glowinski, R., Osher, S., Yin, W. (eds.) Splitting Methods in Communication, Imaging, Science, and Engineering. Springer, Berlin (2016)
Davis, D., Yin, W.: Faster convergence rates of relaxed Peaceman-Rachford and ADMM under regularity assumptions. Math. Oper. Res. 42(3), 783–805 (2017). https://doi.org/10.1287/moor.2016.0827
Douglas, J., Rachford, H.H.: On the numerical solution of heat conduction problems in two and three space variables. Trans. Am. Math. Soc. 82(2), 421–39 (1956)
Drori, Y., Taylor, A.B.: Efficient first-order methods for convex minimization: a constructive approach. Math. Program. 184(1–2), 183–220 (2020). https://doi.org/10.1007/s10107-019-01410-2
Drori, Y., Teboulle, M.: Performance of first-order methods for smooth convex minimization: a novel approach. Math. Program. 145(1–2), 451–82 (2014). https://doi.org/10.1007/s10107-013-0653-0
Drori, Y., Teboulle, M.: An optimal variant of Kelley’s cutting-plane method. Math. Program. 160(1), 321–51 (2016). https://doi.org/10.1007/s10107-016-0985-7
Eckstein, J.: The Lions–Mercier splitting algorithm and the alternating direction method are instances of the proximal point method (1988). Technical report LIDS-P-1769
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992). https://doi.org/10.1007/BF01581204
Esser, E., Zhang, X., Chan, T.: A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science. SIAM J. Imaging Sci. 3(4), 1015–46 (2010). https://doi.org/10.1137/09076934X
Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comput. Math. Appl. 2(1), 17–40 (1976). https://doi.org/10.1016/0898-1221(76)90003-1
Glowinski, R., Marrocco, A.: Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité, d'une classe de problèmes de Dirichlet non linéaires. Rev. Française Automat. Informat. Recherche Opérationnelle R-2, 41–76 (1975)
Goldstein, T., O’Donoghue, B., Setzer, S., Baraniuk, R.: Fast alternating direction optimization methods. SIAM J. Imaging Sci. 7(3), 1588–623 (2014). https://doi.org/10.1137/120896219
Gol’shtein, E.G., Tret’yakov, N.V.: Modified Lagrangians in convex programming and their generalizations. In: Huard, P. (ed.) Point-to-Set Maps and Mathematical Programming, Mathematical Programming Studies 10. Springer, Berlin (1979)
Grant, M., Boyd, S.: Graph implementations for nonsmooth convex programs. In: Blondel, V., Boyd, S., Kimura, H. (eds.) Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp. 95–110. Springer, Berlin (2008)
Gu, G., Yang, J.: On the optimal ergodic sublinear convergence rate of the relaxed proximal point algorithm for variational inequalities (2019). arXiv:1905.06030
Gu, G., Yang, J.: Tight sublinear convergence rate of the proximal point algorithm for maximal monotone inclusion problems. SIAM J. Optim. 30(3), 1905–1921 (2020). https://doi.org/10.1137/19M1299049
Güler, O.: On the convergence of the proximal point algorithm for convex minimization. SIAM J. Control Optim. 29(2), 403–19 (1991). https://doi.org/10.1137/0329022
Güler, O.: New proximal point algorithms for convex minimization. SIAM J. Optim. 2(4), 649–64 (1992). https://doi.org/10.1137/0802032
He, B., Yuan, X.: Convergence analysis of primal-dual algorithms for a saddle-point problem: from contraction perspective. SIAM J. Imaging Sci. 5(1), 119–49 (2012). https://doi.org/10.1137/100814494
Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4(5), 303–20 (1969). https://doi.org/10.1007/BF00927673
Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. Math. Program. 159(1), 81–107 (2016). https://doi.org/10.1007/s10107-015-0949-3
Kim, D., Fessler, J.A.: Another look at the Fast Iterative Shrinkage/Thresholding Algorithm (FISTA). SIAM J. Optim. 28(1), 223–50 (2018). https://doi.org/10.1137/16M108940X
Kim, D., Fessler, J.A.: Generalizing the optimized gradient method for smooth convex minimization. SIAM J. Optim. 28(2), 1920–50 (2018). https://doi.org/10.1137/17m112124x
Kim, D., Fessler, J.A.: Optimizing the efficiency of first-order methods for decreasing the gradient of smooth convex functions. J. Optim. Theory Appl. (2020). https://doi.org/10.1007/s10957-020-01770-2
Lieder, F.: On the convergence rate of the Halpern-iteration. Optim. Lett. 15, 405–18 (2020). https://doi.org/10.1007/s11590-020-01617-9
Lieder, F.: Projection based methods for conic linear programming—optimal first order complexities and norm constrained quasi-Newton methods. Doctoral dissertation, Universitäts- und Landesbibliothek der Heinrich-Heine-Universität Düsseldorf (2018). https://docserv.uniduesseldorf.de/servlets/DerivateServlet/Derivate-49971/Dissertation.pdf
Lin, H., Mairal, J., Harchaoui, Z.: Catalyst acceleration for first-order convex optimization: from theory to practice. J. Mach. Learn. Res. 18(212), 1–54 (2018)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–79 (1979). https://doi.org/10.1137/0716071
Lorenz, D., Pock, T.: An inertial forward–backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51(2), 311–25 (2015). https://doi.org/10.1007/s10851-014-0523-2
Martinet, B.: Régularisation d’inéquations variationnelles par approximations successives. Rev. Française Informat. Recherche Opérationnelle 4, 154–8 (1970)
Minty, G.J.: Monotone (nonlinear) operators in Hilbert space. Duke Math. J. 29(3), 341–6 (1962). https://doi.org/10.1215/S0012-7094-62-02933-2
Minty, G.J.: On the monotonicity of the gradient of a convex function. Pac. J. Math. 14, 243–7 (1964)
Moreau, J.J.: Proximité et dualité dans un espace hilbertien. Bulletin de la Société Mathématique de France 93, 273–99 (1965)
Nemirovski, A.: Efficient methods in convex programming (1994). http://www2.isye.gatech.edu/~nemirovs/Lect_EMCO.pdf. visited on 05/2019
Nesterov, Y.: A method for unconstrained convex minimization problem with the rate of convergence \(O(1/k^2)\). Dokl. Akad. Nauk. USSR 269(3), 543–7 (1983)
Nesterov, Y.: On an approach to the construction of optimal methods of minimization of smooth convex functions. Ekonomika i Matematicheskie Metody 24, 509–517 (1988). (in Russian)
Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–61 (2013). https://doi.org/10.1007/s10107-012-0629-5
O’Donoghue, B., Candes, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–32 (2015). https://doi.org/10.1007/s10208-013-9150-3
Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4(5), 1–17 (1964). https://doi.org/10.1016/0041-5553(64)90137-5
Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization, pp. 283–298. Academic Press, New York (1969)
Rockafellar, R.T.: Monotone operators associated with saddle functions and minimax problems. In: Browder, F.E. (ed.) Nonlinear Functional Analysis, Part 1, vol. 18, pp. 397–407. American Mathematical Society, New York (1970)
Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976). https://doi.org/10.1287/moor.1.2.97
Rockafellar, R.T.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–98 (1976). https://doi.org/10.1137/0314056
Ryu, E.K., Boyd, S.: A primer on monotone operator methods. Appl. Comput. Math. 15(1), 3–43 (2016)
Ryu, E.K., Taylor, A.B., Bergeling, C., Giselsson, P.: Operator splitting performance estimation: tight contraction factors and optimal parameter selection. SIAM J. Optim. 30(3), 2251–2271 (2020). https://doi.org/10.1137/19M1304854
Ryu, E.K., Yin, W.: Large-scale convex optimization via monotone operators (2020). https://large-scale-book.mathopt.com/LSCOMO.pdf. visited on 03/2021
Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations (2018). arXiv:1810.08907
Su, W., Boyd, S., Candes, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17(153), 1–43 (2016)
Taylor, A.B., Bach, F.: Stochastic first-order methods: non-asymptotic and computer-aided analyses via potential functions. In: Proceedings of the Conference on Learning Theory, pp. 2934–2992 (2019)
Taylor, A.B., Hendrickx, J.M., Glineur, F.: Exact worst-case performance of first-order methods for composite convex optimization. SIAM J. Optim. 27(3), 1283–313 (2017). https://doi.org/10.1137/16m108104x
Taylor, A.B., Hendrickx, J.M., Glineur, F.: Smooth strongly convex interpolation and exact worst-case performance of first-order methods. Math. Program. 161(1), 307–45 (2017). https://doi.org/10.1007/s10107-016-1009-3
Acknowledgements
The author sincerely appreciates the useful comments by the associate editor and anonymous referees. The author also thanks Dr. Felix Lieder, who brought his Ph.D. thesis [42] and his paper [41] to the author's attention after the acceptance of this paper; these works similarly use the PEP approach to optimize the step coefficients of the Krasnosel'skii–Mann iteration for a nonexpansive operator \({\varvec{T}}\). The form of the resulting optimized method differs from that of the accelerated proximal point method proposed in this paper, but [62] recently showed that the two are equivalent, in the sense that they generate the same sequence, when \({\varvec{T}} = 2{\varvec{J}} _{\lambda {\varvec{M}}} - {\varvec{I}}\).
This work was supported in part by the National Research Foundation of Korea (NRF) Grant funded by the Korea Government (MSIT) (No. 2019R1A5A1028324), and the POSCO Science Fellowship of POSCO TJ Park Foundation.
Kim, D. Accelerated proximal point method for maximally monotone operators. Math. Program. 190, 57–87 (2021). https://doi.org/10.1007/s10107-021-01643-0