
First-order methods for convex optimization. (English) Zbl 1516.90048

Summary: First-order methods for solving convex optimization problems have been at the forefront of mathematical optimization over the last 20 years. The rapid development of this important class of algorithms is motivated by the success stories reported in various applications, most importantly machine learning, signal processing, imaging, and control theory. First-order methods can provide low-accuracy solutions at low computational cost per iteration, which makes them an attractive set of tools for large-scale optimization problems. In this survey, we cover a number of key developments in gradient-based optimization methods. This includes non-Euclidean extensions of the classical proximal gradient method and its accelerated versions. Additionally, we survey recent developments within the class of projection-free methods and proximal versions of primal-dual schemes. We give complete proofs for various key results and highlight the unifying aspects of several optimization algorithms.
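
Two of the method families named above are simple enough to state in a few lines. The following Python sketch is illustrative only and is not code from the survey; the LASSO and simplex instances, step-size choices, and helper names are assumptions made for this example. It shows the proximal gradient iteration for composite problems \(\min_x f(x)+g(x)\) together with its FISTA-type accelerated variant [32, 176], and the projection-free Frank-Wolfe method [103, 134], which replaces projections by a linear minimization oracle.

```python
import numpy as np

def prox_grad(grad_f, prox_g, x0, step, n_iter=500, accelerate=True):
    """Proximal gradient method for min_x f(x) + g(x); with accelerate=True
    this is the FISTA-type variant with Nesterov momentum."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iter):
        x_new = prox_g(y - step * grad_f(y), step)         # forward-backward step
        if accelerate:
            t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
            y = x_new + ((t - 1.0) / t_new) * (x_new - x)  # momentum extrapolation
            t = t_new
        else:
            y = x_new
        x = x_new
    return x

def frank_wolfe(grad_f, lmo, x0, n_iter=500):
    """Projection-free (conditional gradient) method: each step calls a linear
    minimization oracle lmo(g) in argmin_{s in C} <g, s> instead of projecting."""
    x = x0.copy()
    for k in range(n_iter):
        s = lmo(grad_f(x))
        gamma = 2.0 / (k + 2.0)                            # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * s
    return x

# LASSO via (accelerated) proximal gradient: min_x 0.5*||Ax-b||^2 + lam*||x||_1
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
b = A @ (rng.standard_normal(100) * (rng.random(100) < 0.1))
lam = 0.1
L = np.linalg.norm(A, 2) ** 2                              # Lipschitz constant of grad f
grad_f = lambda x: A.T @ (A @ x - b)
prox_l1 = lambda v, s: np.sign(v) * np.maximum(np.abs(v) - lam * s, 0.0)
x_lasso = prox_grad(grad_f, prox_l1, np.zeros(100), 1.0 / L)

# Least squares over the probability simplex via Frank-Wolfe: vertices are e_i
lmo_simplex = lambda g: np.eye(100)[np.argmin(g)]
x_simplex = frank_wolfe(grad_f, lmo_simplex, np.ones(100) / 100)
```

Under the standard assumptions, with step size \(1/L\) (here \(L=\|A\|_2^2\)), the plain proximal gradient method converges at rate \(O(1/k)\) in objective value, the accelerated variant at rate \(O(1/k^2)\), and Frank-Wolfe at rate \(O(1/k)\) without ever projecting onto the feasible set.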

MSC:

90C25 Convex programming
90C06 Large-scale problems in mathematical programming
90C30 Nonlinear programming
65Y20 Complexity and performance of numerical algorithms
90-02 Research exposition (monographs, survey articles) pertaining to operations research and mathematical programming

References:

[1] Alacaoglu, A.; Tran Dinh, Q.; Fercoq, O.; Cevher, V., Smooth primal-dual coordinate descent algorithms for nonsmooth convex optimization, (Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; Garnett, R., Advances in Neural Information Processing Systems, Vol. 30 (2017), Curran Associates, Inc.), 5852-5861
[2] Allen-Zhu, Z., Katyusha: The first direct acceleration of stochastic gradient methods, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017, 1200-1205 (2017), ACM: ACM New York, NY, USA · Zbl 1369.68273
[3] Allen-Zhu, Z.; Hazan, E., Optimal black-box reductions between optimization objectives, (Lee, D.; Sugiyama, M.; Luxburg, U.; Guyon, I.; Garnett, R., Advances in Neural Information Processing Systems, Vol. 29 (2016), Curran Associates, Inc.), 1614-1622
[4] Alvarez, F.; Bolte, J.; Brahic, O., Hessian Riemannian gradient flows in convex programming, SIAM Journal on Control and Optimization, 43, 2, 477-501 (2004) · Zbl 1077.34050
[5] Andersen, E. D.; Andersen, K. D., The Mosek Interior Point Optimizer for Linear Programming: An Implementation of the Homogeneous Algorithm, 197-232 (2000), Springer US, Boston, MA · Zbl 1028.90022
[6] Anikin, A. S.; Gasnikov, A. V.; Dvurechensky, P. E.; Tyurin, A. I.; Chernov, A. V., Dual approaches to the minimization of strongly convex functionals with a simple structure under affine constraints, Computational Mathematics and Mathematical Physics, 57, 8, 1262-1276 (2017) · Zbl 1380.49046
[7] Arrow, K.; Hurwicz, L.; Uzawa, H., Studies in linear and non-linear programming, (Chenery, H.; Johnson, S.; Karlin, S.; Marschak, T.; Solow, R., Stanford Mathematical Studies in the Social Sciences, Vol. II (1958), Stanford University Press: Stanford University Press Stanford) · Zbl 0091.16002
[8] Attouch, H.; Bolte, J.; Redont, P.; Teboulle, M., Singular Riemannian barrier methods and gradient-projection dynamical systems for constrained optimization, Optimization, 53, 5-6, 435-454 (2004) · Zbl 1153.34312
[9] Attouch, H.; Cabot, A.; Frankel, P.; Peypouquet, J., Alternating proximal algorithms for linearly constrained variational inequalities: application to domain decomposition for pde’s, Nonlinear Analysis: Theory, Methods & Applications, 74, 18, 7455-7473 (2011) · Zbl 1228.65100
[10] Attouch, H.; Chbani, Z.; Fadili, J.; Riahi, H., First-order optimization algorithms via inertial systems with Hessian-driven damping, Mathematical Programming (2020)
[11] Attouch, H.; Chbani, Z.; Peypouquet, J.; Redont, P., Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity, Mathematical Programming, 168, 1-2, 123-175 (2018) · Zbl 1395.34068
[12] Attouch, H.; Redont, P.; Soubeyran, A., A new class of alternating proximal minimization algorithms with costs-to-move, SIAM Journal on Optimization, 18, 3, 1061-1081 (2007) · Zbl 1149.65039
[13] Attouch, H.; Teboulle, M., Regularized Lotka-Volterra dynamical system as continuous proximal-like method in optimization, Journal of Optimization Theory and Applications, 121, 3, 541-570 (2004) · Zbl 1076.90053
[14] Auslender, A.; Teboulle, M., Asymptotic cones and functions in optimization and variational inequalities (2006), Springer Science & Business Media
[15] Auslender, A.; Teboulle, M., Interior gradient and proximal methods for convex and conic optimization, SIAM Journal on Optimization, 16, 3, 697-725 (2006) · Zbl 1113.90118
[16] Auslender, A.; Teboulle, M., Projected subgradient methods with non-Euclidean distances for non-differentiable convex minimization and variational inequalities, Mathematical Programming, 120, 1, 27-48 (2009) · Zbl 1190.90118
[17] Bach, F., Duality between subgradient and conditional gradient methods, SIAM Journal on Optimization, 25, 1, 115-129 (2015) · Zbl 1358.90155
[18] Bach, F.; Levy, K. Y., A universal algorithm for variational inequalities adaptive to smoothness and noise, (Beygelzimer, A.; Hsu, D., Proceedings of the Thirty-Second Conference on Learning Theory, Vol. 99 of Proceedings of Machine Learning Research, PMLR, Phoenix, USA (2019)), 164-194
[19] Baes, M., Estimate sequence methods: extensions and approximations, Institute for Operations Research (2009), ETH, Zürich, Switzerland
[20] Bah, B.; Rauhut, H.; Terstiege, U.; Westdickenberg, M., Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers, Information and Inference: A Journal of the IMA, iaaa039 (2019)
[21] Baimurzina, D. R.; Gasnikov, A. V.; Gasnikova, E. V.; Dvurechensky, P. E.; Ershov, E. I.; Kubentaeva, M. B.; Lagunovskaya, A. A., Universal method of searching for equilibria and stochastic equilibria in transportation networks, Computational Mathematics and Mathematical Physics, 59, 1, 19-33 (2019) · Zbl 1419.90013
[22] Banert, S.; Boţ, R. I.; Csetnek, E. R., Fixing and extending some recent results on the ADMM algorithm, Numerical Algorithms, 86, 3, 1303-1325 (2021) · Zbl 1489.65082
[23] Bauschke, H.; Borwein, J.; Combettes, P., Bregman monotone optimization algorithms, SIAM Journal on Control and Optimization, 42, 2, 596-636 (2003) · Zbl 1049.90053
[24] Bauschke, H. H.; Bolte, J.; Teboulle, M., A descent lemma beyond Lipschitz gradient continuity: First-order methods revisited and applications, Mathematics of Operations Research, 42, 2, 330-348 (2016) · Zbl 1364.90251
[25] Bauschke, H. H.; Combettes, P. L., Convex Analysis and Monotone Operator Theory in Hilbert Spaces (2016), Springer - CMS Books in Mathematics
[26] Ch. 8 · Zbl 1421.90112
[27] Beck, A., First-Order Methods in Optimization (2017), Society for Industrial and Applied Mathematics · Zbl 1384.65033
[28] Beck, A.; Guttmann-Beck, N., FOM - a MATLAB toolbox of first-order methods for solving convex optimization problems, Optimization Methods and Software, 34, 1, 172-193 (2019) · Zbl 1461.65127
[29] Beck, A.; Shtern, S., Linearly convergent away-step conditional gradient for non-strongly convex functions, Mathematical Programming, 164, 1-2, 1-27 (2017) · Zbl 1370.90010
[30] Beck, A.; Teboulle, M., Mirror descent and nonlinear projected subgradient methods for convex optimization, Operations Research Letters, 31, 3, 167-175 (2003) · Zbl 1046.90057
[31] Beck, A.; Teboulle, M., A conditional gradient method with linear rate of convergence for solving convex linear systems, Mathematical Methods of Operations Research, 59, 2, 235-247 (2004) · Zbl 1138.90440
[32] Beck, A.; Teboulle, M., A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM Journal on Imaging Sciences, 2, 1, 183-202 (2009) · Zbl 1175.94009
[33] Beck, A.; Teboulle, M., Gradient-based algorithms with applications to signal recovery, (Palomar, D. P.; Eldar, Y. C., Convex optimization in signal processing and communications (2009), Cambridge University Press), 42-88 · Zbl 1211.90290
[34] Beck, A.; Teboulle, M., Smoothing and first order methods: A unified framework, SIAM Journal on Optimization, 22, 2, 557-580 (2012) · Zbl 1251.90304
[35] Ben-Tal, A.; Margalit, T.; Nemirovski, A., The ordered subsets mirror descent optimization method with applications to tomography, SIAM Journal on Optimization, 12, 1, 79-108 (2001) · Zbl 0992.92034
[36] URL https://www2.isye.gatech.edu/~nemirovs/LMCOLN2020WithSol.pdf
[37] Benaïm, M., Recursive algorithms, urn processes, and the chaining number of chain recurrent sets, Ergodic Theory and Dynamical Systems, 18, 53-87 (1998) · Zbl 0921.60061
[38] Benveniste, A.; Métivier, M.; Priouret, P., Adaptive Algorithms and Stochastic Approximations (1990), Springer: Springer Berlin · Zbl 0752.93073
[39] Bian, W.; Chen, X., Linearly constrained non-Lipschitz optimization for image restoration, SIAM Journal on Imaging Sciences, 8, 4, 2294-2322 (2015) · Zbl 1327.90299
[40] Bian, W.; Chen, X.; Ye, Y., Complexity analysis of interior point algorithms for non-Lipschitz and nonconvex minimization, Mathematical Programming, 149, 1, 301-327 (2015) · Zbl 1318.90075
[41] Bolte, J.; Nguyen, T. P.; Peypouquet, J.; Suter, B. W., From error bounds to the complexity of first-order descent methods for convex functions, Mathematical Programming, 165, 2, 471-507 (2017) · Zbl 1373.90076
[42] Bolte, J.; Teboulle, M., Barrier operators and associated gradient-like dynamical systems for constrained minimization problems, SIAM Journal on Control and Optimization, 42, 4, 1266-1292 (2003) · Zbl 1051.49010
[43] Bomze, I. M.; Mertikopoulos, P.; Schachinger, W.; Staudigl, M., Hessian barrier algorithms for linearly constrained optimization problems, SIAM Journal on Optimization, 29, 3, 2100-2127 (2019) · Zbl 1421.90164
[44] Bottou, L.; Curtis, F.; Nocedal, J., Optimization methods for large-scale machine learning, SIAM Review, 60, 2, 223-311 (2018) · Zbl 1397.65085
[45] Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J., Distributed optimization and statistical learning via the alternating direction method of multipliers, Foundations and Trends® in Machine Learning, 3, 1, 1-122 (2011) · Zbl 1229.90122
[46] Boyd, S.; Vandenberghe, L., Convex optimization (2004), Cambridge University Press · Zbl 1058.90049
[47] Bruckstein, A.; Donoho, D.; Elad, M., From sparse solutions of systems of equations to sparse modeling of signals and images, SIAM Review, 51, 1, 34-81 (2009) · Zbl 1178.68619
[48] Bubeck, S., Convex optimization: Algorithms and complexity, Foundations and Trends in Machine Learning, 8, 3-4, 231-357 (2015) · Zbl 1365.90196
[49] Bühlmann, P.; van de Geer, S., Statistics for high-dimensional data, Springer Series in Statistics (2011), Springer-Verlag Berlin Heidelberg · Zbl 1273.62015
[50] Bùi, M. N.; Combettes, P. L., Bregman forward-backward operator splitting, Set-Valued and Variational Analysis, 29, 3, 583-603 (2021) · Zbl 1487.47097
[51] Candes, E.; Tao, T., The Dantzig selector: Statistical estimation when \(p\) is much larger than \(n\), The Annals of Statistics, 35, 6, 2313-2351 (2007) · Zbl 1139.62019
[52] Candes, E. J.; Romberg, J.; Tao, T., Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, IEEE Transactions on Information Theory, 52, 2, 489-509 (2006) · Zbl 1231.94017
[53] Canon, M. D.; Cullum, C. D., A tight upper bound on the rate of convergence of Frank-Wolfe algorithm, SIAM Journal on Control, 6, 4, 509-516 (1968) · Zbl 0186.24002
[54] Carderera, A.; Besancon, M.; Pokutta, S., Simple steps are all you need: Frank-Wolfe and generalized self-concordant functions
[55] Carderera, A.; Diakonikolas, J.; Lin, C. Y.; Pokutta, S., Parameter-free locally accelerated conditional gradients, (Meila, M.; Zhang, T., Proceedings of the 38th International Conference on Machine Learning, Vol. 139 of Proceedings of Machine Learning Research, PMLR (2021)), 1283-1293
[56] Censor, Y.; Zenios, S. A., Proximal minimization algorithm with d-functions, Journal of Optimization Theory and Applications, 73, 3, 451-464 (1992) · Zbl 0794.90058
[57] Cesa-Bianchi, N.; Lugosi, G., Prediction, Learning, and Games (2006), Cambridge University Press · Zbl 1114.91001
[58] Chambolle, A.; Pock, T., A first-order primal-dual algorithm for convex problems with applications to imaging, Journal of Mathematical Imaging and Vision, 40, 1, 120-145 (2011) · Zbl 1255.68217
[59] Chen, C.; He, B.; Ye, Y.; Yuan, X., The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent, Mathematical Programming, 155, 1-2, 57-79 (2016) · Zbl 1332.90193
[60] Chen, G.; Teboulle, M., Convergence analysis of a proximal-like minimization algorithm using Bregman functions, SIAM Journal on Optimization, 3, 3, 538-543 (1993) · Zbl 0808.90103
[61] 10.1007/s10107-017-1161-4
[62] Chernov, A.; Dvurechensky, P.; Gasnikov, A., Fast primal-dual gradient method for strongly convex minimization problems with linear constraints, (Kochetov, Y.; Khachay, M.; Beresnev, V.; Nurminski, E.; Pardalos, P., Discrete Optimization and Operations Research: 9th International Conference, DOOR 2016, Vladivostok, Russia, September 19-23, 2016, Proceedings (2016), Springer International Publishing), 391-403 · Zbl 1391.90471
[63] Cohen, M.; Diakonikolas, J.; Orecchia, L., On acceleration with noise-corrupted gradients, (Dy, J.; Krause, A., Proceedings of the 35th International Conference on Machine Learning, Vol. 80 of Proceedings of Machine Learning Research, PMLR (2018), Stockholmsmässan, Stockholm Sweden), 1019-1028
[64] Vol. 185 of Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl-Leibniz-Zentrum für Informatik, Dagstuhl, Germany
[65] Combettes, P. L.; Pesquet, J. C., Proximal splitting methods in signal processing, 185-212 (2011), Springer · Zbl 1242.90160
[66] Combettes, P. L.; Wajs, V. R., Signal recovery by proximal forward-backward splitting, Multiscale Modeling & Simulation, 4, 4, 1168-1200 (2005) · Zbl 1179.94031
[67] Cox, B.; Juditsky, A.; Nemirovski, A., Dual subgradient algorithms for large-scale nonsmooth learning problems, Mathematical Programming, 148, 1, 143-180 (2014) · Zbl 1305.65149
[68] Damla Ahipasaoglu, S.; Sun, P.; Todd, M. J., Linear convergence of a modified Frank-Wolfe algorithm for computing minimum-volume enclosing ellipsoids, Optimization Methods and Software, 23, 1, 5-19 (2008) · Zbl 1146.90047
[69] Danilova, M.; Dvurechensky, P.; Gasnikov, A.; Gorbunov, E.; Guminov, S.; Kamzolov, D.; Shibaev, I., Recent theoretical advances in non-convex optimization, arXiv:2012.06188 (2020). Accepted to be part of the Springer volume “High Dimensional Optimization and Probability”
[70] d’Aspremont, A., Smooth optimization with approximate gradient, SIAM Journal on Optimization, 19, 3, 1171-1183 (2008) · Zbl 1180.90378
[71] d’Aspremont, A.; Scieur, D.; Taylor, A., Acceleration methods
[72] Daubechies, I.; Defrise, M.; De Mol, C., An iterative thresholding algorithm for linear inverse problems with a sparsity constraint, Communications on Pure and Applied Mathematics, 57, 11, 1413-1457 (2004) · Zbl 1077.65055
[73] Davis, D.; Drusvyatskiy, D.; Kakade, S.; Lee, J. D., Stochastic subgradient method converges on tame functions, Foundations of Computational Mathematics, 20, 1, 119-154 (2020) · Zbl 1433.65141
[74] Devolder, O., Exactness, inexactness and stochasticity in first-order methods for large-scale convex optimization, Ph.D. thesis (2013)
[75] Devolder, O.; Glineur, F.; Nesterov, Y., First-order methods of smooth convex optimization with inexact oracle, Mathematical Programming, 146, 1, 37-75 (2014) · Zbl 1317.90196
[76] Diakonikolas, J.; Orecchia, L., Alternating randomized block coordinate descent, (Dy, J.; Krause, A., Proceedings of the 35th International Conference on Machine Learning, Vol. 80 of Proceedings of Machine Learning Research, PMLR, Stockholmsmässan, Stockholm Sweden (2018)), 1224-1232
[77] Doljansky, M.; Teboulle, M., An interior proximal algorithm and the exponential multiplier method for semidefinite programming, SIAM Journal on Optimization, 9, 1, 1-13 (1998) · Zbl 0960.90066
[78] Donoho, D. L., Compressed sensing, IEEE Transactions on Information Theory, 52, 4, 1289-1306 (2006) · Zbl 1288.94016
[79] Dragomir, R.-A.; Taylor, A. B.; d’Aspremont, A.; Bolte, J., Optimal complexity and certification of bregman first-order methods, Mathematical Programming (2021)
[80] Duchi, J.; Hazan, E.; Singer, Y., Adaptive subgradient methods for online learning and stochastic optimization, Journal of Machine Learning Research, 12, 2121-2159 (2011) (Jul.) · Zbl 1280.68164
[81] Duchi, J.; Shalev-Shwartz, S.; Singer, Y.; Tewari, A., Composite objective mirror descent, COLT 2010 - The 23rd Conference on Learning Theory, 14-26 (2010)
[82] Duchi, J. C.; Ruan, F., Stochastic methods for composite and weakly convex optimization problems, SIAM Journal on Optimization, 28, 4, 3229-3259 (2018) · Zbl 06989166
[83] Dunn, J. C., Rates of convergence for conditional gradient algorithms near singular and nonsingular extremals, SIAM Journal on Control and Optimization, 17, 2, 187-211 (1979) · Zbl 0403.49028
[84] Dvinskikh, D.; Gorbunov, E.; Gasnikov, A.; Dvurechensky, P.; Uribe, C. A., On primal and dual approaches for distributed stochastic convex optimization over networks, 2019 IEEE 58th Conference on Decision and Control (CDC), 7435-7440 (2019)
[85] Dvinskikh, D.; Ogaltsov, A.; Gasnikov, A.; Dvurechensky, P.; Spokoiny, V., On the line-search gradient methods for stochastic optimization, IFAC-PapersOnLine, 53, 2, 1715-1720 (2020)
[86] NeurIPS 2018
[87] Dvurechensky, P.; Gasnikov, A., Stochastic intermediate gradient method for convex problems with stochastic inexact oracle, Journal of Optimization Theory and Applications, 171, 1, 121-145 (2016) · Zbl 1351.90150
[88] Dvurechensky, P.; Gasnikov, A.; Gasnikova, E.; Matsievsky, S.; Rodomanov, A.; Usik, I., Primal-dual method for searching equilibrium in hierarchical congestion population games, Supplementary Proceedings of the 9th International Conference on Discrete Optimization and Operations Research and Scientific School (DOOR 2016) Vladivostok, Russia, September 19 - 23, 2016, 584-595 (2016)
[89] Dvurechensky, P.; Gasnikov, A.; Kroshnin, A., Computational optimal transport: Complexity by accelerated gradient descent is better than by Sinkhorn’s algorithm, (Dy, J.; Krause, A., Proceedings of the 35th International Conference on Machine Learning, Vol. 80 of Proceedings of Machine Learning Research (2018)), 1367-1376
[90] Dvurechensky, P.; Gasnikov, A.; Tiurin, A.; Zholobov, V., Unifying framework for accelerated randomized methods in convex optimization · Zbl 07822579
[91] Dvurechensky, P.; Gorbunov, E.; Gasnikov, A., An accelerated directional derivative method for smooth stochastic convex optimization, European Journal of Operational Research, 290, 2, 601-621 (2021) · Zbl 1487.90524
[92] Dvurechensky, P.; Kamzolov, D.; Lukashevich, A.; Lee, S.; Ordentlich, E.; Uribe, C. A.; Gasnikov, A., Hyperfast second-order local solvers for efficient statistically preconditioned distributed optimization · Zbl 1530.90061
[93] Dvurechensky, P.; Ostroukhov, P.; Safin, K.; Shtern, S.; Staudigl, M., Self-concordant analysis of Frank-Wolfe algorithms, (Daumé III, H.; Singh, A., Proceedings of the 37th International Conference on Machine Learning, Vol. 119 of Proceedings of Machine Learning Research, PMLR, Virtual (2020)), 2814-2824
[94] Dvurechensky, P.; Safin, K.; Shtern, S.; Staudigl, M., Generalized self-concordant analysis of Frank-Wolfe algorithms · Zbl 1512.90163
[95] Dvurechensky, P. E.; Gasnikov, A. V.; Nurminski, E. A.; Stonyakin, F. S., Advances in Low-Memory Subgradient Optimization, 19-59 (2020), Springer International Publishing: Springer International Publishing Cham
[96] Eckstein, J., Some saddle-function splitting methods for convex programming, Optimization Methods and Software, 4, 1, 75-83 (1994)
[97] Eckstein, J.; Bertsekas, D. P., On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators, Mathematical Programming, 55, 1-3, 293-318 (1992) · Zbl 0765.90073
[98] Epelman, M.; Freund, R. M., Condition number complexity of an elementary algorithm for computing a reliable solution of a conic linear system, Mathematical Programming, 88, 3, 451-485 (2000) · Zbl 0989.65061
[99] Facchinei, F.; Pang, J. S., Finite-dimensional variational inequalities and complementarity problems, Volumes I and II, Springer Series in Operations Research (2003) · Zbl 1062.90002
[100] Feizollahi, M. J.; Ahmed, S.; Sun, A., Exact augmented Lagrangian duality for mixed integer linear programming, Mathematical Programming, 161, 1, 365-387 (2017) · Zbl 1364.90226
[101] Fercoq, O.; Qu, Z., Restarting the accelerated coordinate descent method with a rough strong convexity estimate, Computational Optimization and Applications, 75, 1, 63-91 (2020) · Zbl 1432.90109
[102] Fercoq, O.; Richtárik, P., Accelerated, parallel, and proximal coordinate descent, SIAM Journal on Optimization, 25, 4, 1997-2023 (2015) · Zbl 1327.65108
[103] Frank, M.; Wolfe, P., An algorithm for quadratic programming, Naval Research Logistics Quarterly, 3, 1-2, 95-110 (1956)
[104] Freund, R. M.; Grigas, P., New analysis and results for the Frank-Wolfe method, Mathematical Programming, 155, 1-2, 199-230 (2016) · Zbl 1342.90101
[105] Frostig, R.; Ge, R.; Kakade, S.; Sidford, A., Un-regularizing: approximate proximal point and faster stochastic algorithms for empirical risk minimization, (Bach, F.; Blei, D., Proceedings of the 32nd International Conference on Machine Learning, Vol. 37 of Proceedings of Machine Learning Research, PMLR, Lille, France (2015)), 2540-2548
[106] Ch. ix
[107] Garber, D.; Hazan, E., Faster rates for the Frank-Wolfe method over strongly-convex sets, 32nd International Conference on Machine Learning, ICML 2015 (2015)
[108] Garber, D.; Hazan, E., A linearly convergent variant of the conditional gradient algorithm under strong convexity, with applications to online and stochastic optimization, SIAM Journal on Optimization, 26, 3, 1493-1528 (2016) · Zbl 1342.65142
[109] Gasnikov, A.; Dvurechensky, P.; Gorbunov, E.; Vorontsova, E.; Selikhanovych, D.; Uribe, C. A.; Jiang, B.; Wang, H.; Zhang, S.; Bubeck, S.; Jiang, Q.; Lee, Y. T.; Li, Y.; Sidford, A., Near optimal methods for minimizing convex functions with Lipschitz \(p\)-th derivatives, (Beygelzimer, A.; Hsu, D., Proceedings of the Thirty-Second Conference on Learning Theory, Vol. 99 of Proceedings of Machine Learning Research, PMLR, Phoenix, USA (2019)), 1392-1393
[110] Gasnikov, A. V.; Dvurechensky, P. E., Stochastic intermediate gradient method for convex optimization problems, Doklady Mathematics, 93, 2, 148-151 (2016) · Zbl 1342.90131
[111] Gasnikov, A. V.; Nesterov, Y. E., Universal method for stochastic composite optimization problems, Computational Mathematics and Mathematical Physics, 58, 1, 48-64 (2018) · Zbl 1457.90099
[112] Gasnikov, A. V.; Tyurin, A. I., Fast gradient descent for convex minimization problems with an oracle producing a \((\delta, L)\)-model of function at the requested point, Computational Mathematics and Mathematical Physics, 59, 7, 1085-1097 (2019) · Zbl 1531.90101
[113] Ghadimi, S.; Lan, G., Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, I: A generic algorithmic framework, SIAM Journal on Optimization, 22, 4, 1469-1492 (2012) · Zbl 1301.62077
[114] Ghadimi, S.; Lan, G., Optimal stochastic approximation algorithms for strongly convex stochastic composite optimization, II: Shrinking procedures and optimal algorithms, SIAM Journal on Optimization, 23, 4, 2061-2089 (2013) · Zbl 1293.62167
[115] Ghadimi, S.; Lan, G.; Zhang, H., Generalized uniformly optimal methods for nonlinear programming, Journal of Scientific Computing, 79, 3, 1854-1881 (2019) · Zbl 1418.90191
[116] Glowinski, R.; Tallec, P. L., Augmented Lagrangian and operator-splitting methods in nonlinear mechanics (1989), SIAM · Zbl 0698.73001
[117] Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y., Generative adversarial nets, Advances in neural information processing systems, 2672-2680 (2014)
[118] Gorbunov, E.; Danilova, M.; Gasnikov, A., Stochastic optimization with heavy-tailed noise via accelerated gradient clipping, (Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M. F.; Lin, H., Advances in Neural Information Processing Systems, Vol. 33 (2020), Curran Associates, Inc.), 15042-15053
[119] Gorbunov, E.; Danilova, M.; Shibaev, I.; Dvurechensky, P.; Gasnikov, A., Near-optimal high probability complexity bounds for non-smooth stochastic optimization with heavy-tailed noise
[120] Gorbunov, E.; Dvinskikh, D.; Gasnikov, A., Optimal decentralized distributed algorithms for stochastic convex optimization
[121] Gorbunov, E.; Dvurechensky, P.; Gasnikov, A., An accelerated method for derivative-free smooth stochastic convex optimization · Zbl 1494.90058
[122] Gorbunov, E.; Rogozin, A.; Beznosikov, A.; Dvinskikh, D.; Gasnikov, A., Recent theoretical advances in decentralized distributed convex optimization, arXiv preprint arXiv:2011.13259. Accepted to be part of the Springer volume “High Dimensional Optimization and Probability”
[123] Gower, R. M.; Schmidt, M.; Bach, F.; Richtárik, P., Variance-reduced methods for machine learning, Proceedings of the IEEE, Vol. 108, 1968-1983 (2020)
[124] Guélat, J.; Marcotte, P., Some comments on Wolfe’s ‘away step’, Mathematical Programming, 35, 1, 110-119 (1986) · Zbl 0592.90074
[125] Guminov, S.; Dvurechensky, P.; Tupitsa, N.; Gasnikov, A., On a combination of alternating minimization and Nesterov’s momentum, (Meila, M.; Zhang, T., Proceedings of the 38th International Conference on Machine Learning, Vol. 139 of Proceedings of Machine Learning Research, PMLR, Virtual (2021)), 3886-3898
[126] Guminov, S. V.; Nesterov, Y. E.; Dvurechensky, P. E.; Gasnikov, A. V., Accelerated primal-dual gradient descent with linesearch for convex, nonconvex, and nonsmooth optimization problems, Doklady Mathematics, 99, 2, 125-128 (2019) · Zbl 1418.90192
[127] Haeser, G.; Liu, H.; Ye, Y., Optimality condition and complexity analysis for linearly-constrained optimization without differentiability on the boundary, Mathematical Programming (2018)
[128] Hanzely, F.; Richtárik, P.; Xiao, L., Accelerated Bregman proximal gradient methods for relatively smooth convex optimization, Computational Optimization and Applications, 79, 2, 405-440 (2021) · Zbl 1473.90114
[129] Helmke, U.; Moore, J. B., Optimization and dynamical systems, Communications & Control Engineering (1996), Springer Berlin Heidelberg · Zbl 0943.93001
[130] Hendrikx, H.; Xiao, L.; Bubeck, S.; Bach, F.; Massoulie, L., Statistically preconditioned accelerated gradient method for distributed optimization, (Daumé III, H.; Singh, A., Proceedings of the 37th International Conference on Machine Learning, Vol. 119 of Proceedings of Machine Learning Research, PMLR (2020)), 4203-4227
[131] Hiriart-Urruty, J.-B.; Lemaréchal, C., Fundamentals of Convex Analysis (2001), Springer · Zbl 0998.49001
[132] Holloway, C. A., An extension of the Frank and Wolfe method of feasible directions, Mathematical Programming, 6, 1, 14-27 (1974) · Zbl 0283.90042
[133] Ivanova, A.; Dvurechensky, P.; Gasnikov, A.; Kamzolov, D., Composite optimization for the resource allocation problem, Optimization Methods and Software, 0, 0, 1-35 (2020)
[134] Jaggi, M., Revisiting Frank-Wolfe: Projection-free sparse convex optimization, International Conference on Machine Learning, 427-435 (2013)
[135] Jain, P.; Kar, P., Non-convex optimization for machine learning, Found. Trends Mach. Learn., 10, 3-4, 142-336 (2017) · Zbl 1388.68251
[136] Juditsky, A.; Kılınç-Karzan, F.; Nemirovski, A., Randomized first order algorithms with applications to \(\ell_1\)-minimization, Mathematical Programming, 142, 1, 269-310 (2013) · Zbl 1282.90128
[137] Juditsky, A.; Nazin, A. V.; Tsybakov, A. B.; Vayatis, N., Recursive aggregation of estimators by the mirror descent algorithm with averaging, Problems of Information Transmission, 41, 4, 368-384 (2005) · Zbl 1123.62044
[138] Juditsky, A.; Nemirovski, A., First-order methods for nonsmooth convex large-scale optimization, I: General purpose methods, (Sra, S.; Nowozin, S.; Wright, S. J., Optimization for Machine Learning (2011), MIT Press), Ch. 5
[139] Juditsky, A.; Nemirovski, A., First-order methods for nonsmooth convex large-scale optimization, II: Utilizing problem’s structure, (Sra, S.; Nowozin, S.; Wright, S. J., Optimization for Machine Learning (2011), MIT Press), Ch. 6
[140] Juditsky, A.; Nesterov, Y., Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization, Stochastic Systems, 4, 1, 44-80 (2014) · Zbl 1297.90097
[141] Kairouz, P.; McMahan, H. B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A. N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; D’Oliveira, R. G. L.; Eichner, H.; Rouayheb, S. E.; Evans, D.; Gardner, J.; Garrett, Z.; Gascón, A.; Ghazi, B.; Gibbons, P. B.; Gruteser, M.; Harchaoui, Z.; He, C.; He, L.; Huo, Z.; Hutchinson, B.; Hsu, J.; Jaggi, M.; Javidi, T.; Joshi, G.; Khodak, M.; Konečný, J.; Korolova, A.; Koushanfar, F.; Koyejo, S.; Lepoint, T.; Liu, Y.; Mittal, P.; Mohri, M.; Nock, R.; Özgür, A.; Pagh, R.; Qi, H.; Ramage, D.; Raskar, R.; Raykova, M.; Song, D.; Song, W.; Stich, S. U.; Sun, Z.; Suresh, A. T.; Tramèr, F.; Vepakomma, P.; Wang, J.; Xiong, L.; Xu, Z.; Yang, Q.; Yu, F. X.; Yu, H.; Zhao, S., Advances and open problems in federated learning, Foundations and Trends® in Machine Learning, 14, 1-2, 1-210 (2021)
[142] Kamzolov, D.; Dvurechensky, P.; Gasnikov, A. V., Universal intermediate gradient method for convex problems with inexact oracle, Optimization Methods and Software, 0, 0, 1-28 (2020)
[143] Kerdreux, T.; d’Aspremont, A.; Pokutta, S., Restarting Frank-Wolfe, (Chaudhuri, K.; Sugiyama, M., Proceedings of Machine Learning Research, Vol. 89 of Proceedings of Machine Learning Research, PMLR (2019)), 1275-1283
[144] Kroshnin, A.; Tupitsa, N.; Dvinskikh, D.; Dvurechensky, P.; Gasnikov, A.; Uribe, C., On the complexity of approximating Wasserstein barycenters, (Chaudhuri, K.; Salakhutdinov, R., Proceedings of the 36th International Conference on Machine Learning, Vol. 97 of Proceedings of Machine Learning Research, PMLR, Long Beach, California, USA (2019)), 3530-3540
[145] Kuczyński, J.; Woźniakowski, H., Estimating the largest eigenvalue by the power and Lanczos algorithms with a random start, SIAM Journal on Matrix Analysis and Applications, 13, 4, 1094-1122 (1992) · Zbl 0759.65016
[146] Kushner, H. J., Approximation and Weak Convergence Methods for Random Processes (1984), The MIT Press · Zbl 0551.60056
[147] Lacoste-Julien, S.; Jaggi, M., On the global linear convergence of Frank-Wolfe optimization variants, Advances in Neural Information Processing Systems, 28, 496-504 (2015)
[148] Lan, G., The complexity of large-scale convex programming under a linear optimization oracle
[149] Lan, G., An optimal method for stochastic composite optimization, Mathematical Programming, 133, 1, 365-397 (2012) · Zbl 1273.90136
[150] Lan, G., First-order and Stochastic Optimization Methods for Machine Learning (2020), Springer · Zbl 1442.68003
[151] Lan, G.; Zhou, Y., An optimal randomized incremental gradient method, Mathematical Programming (2017)
[152] Lan, G.; Zhou, Y., Conditional gradient sliding for convex optimization, SIAM Journal on Optimization, 26, 2, 1379-1409 (2016) · Zbl 1342.90132
[153] Lee, Y. T.; Sidford, A., Efficient accelerated coordinate descent methods and faster algorithms for solving linear systems, Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, FOCS ’13, 147-156 (2013), IEEE Computer Society: IEEE Computer Society Washington, DC, USA
[154] Levitin, E. S.; Polyak, B. T., Constrained minimization methods, USSR Computational Mathematics and Mathematical Physics, 6, 5, 1-50 (1966) · Zbl 0184.38902
[155] Levy, K. Y.; Yurtsever, A.; Cevher, V., Online adaptive methods, universality and acceleration, (Bengio, S.; Wallach, H.; Larochelle, H.; Grauman, K.; Cesa-Bianchi, N.; Garnett, R., Advances in Neural Information Processing Systems, Vol. 31 (2018), Curran Associates, Inc.), 6500-6509
[156] Lin, F.; Fardad, M.; Jovanović, M. R., Sparse feedback synthesis via the alternating direction method of multipliers, 2012 American Control Conference (ACC), 4765-4770 (2012), IEEE
[157] Lin, H.; Mairal, J.; Harchaoui, Z., A universal catalyst for first-order optimization, Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, 3384-3392 (2015), MIT Press, Cambridge, MA, USA
[158] Lin, Q.; Lu, Z.; Xiao, L., An accelerated proximal coordinate gradient method, (Ghahramani, Z.; Welling, M.; Cortes, C.; Lawrence, N. D.; Weinberger, K. Q., Advances in Neural Information Processing Systems, Vol. 27 (2014), Curran Associates, Inc.), 3059-3067
[159] Lin, T.; Ho, N.; Chen, X.; Cuturi, M.; Jordan, M., Fixed-support Wasserstein barycenters: Computational hardness and fast algorithm, (Larochelle, H.; Ranzato, M.; Hadsell, R.; Balcan, M. F.; Lin, H., Advances in Neural Information Processing Systems, Vol. 33 (2020), Curran Associates, Inc.), 5368-5380
[160] Lin, T.; Ho, N.; Cuturi, M.; Jordan, M. I., On the complexity of approximating multimarginal optimal transport
[161] Lin, T.; Ho, N.; Jordan, M., On efficient optimal transport: An analysis of greedy and accelerated mirror descent algorithms, (Chaudhuri, K.; Salakhutdinov, R., Proceedings of the 36th International Conference on Machine Learning, Vol. 97 of Proceedings of Machine Learning Research, PMLR, Long Beach, California, USA (2019)), 3982-3991
[162] Ljung, L.; Pflug, G.; Walk, H., Stochastic approximation and optimization of random systems, 17 (2012), Birkhäuser
[163] Lu, H.; Freund, R.; Nesterov, Y., Relatively smooth convex optimization by first-order methods, and applications, SIAM Journal on Optimization, 28, 1, 333-354 (2018) · Zbl 1392.90090
[164] Malitsky, Y.; Pock, T., A first-order primal-dual algorithm with linesearch, SIAM Journal on Optimization, 28, 1, 411-432 (2018) · Zbl 1390.49033
[165] Martinet, B., Régularisation d’inéquations variationnelles par approximations successives, Revue française d’informatique et de recherche opérationnelle. Série rouge, 4, R3, 154-158 (1970) · Zbl 0215.21103
[166] Mertikopoulos, P.; Staudigl, M., On the convergence of gradient-like flows with noisy gradient input, SIAM Journal on Optimization, 28, 1, 163-197 (2018) · Zbl 1387.90187
[167] Mertikopoulos, P.; Staudigl, M., Stochastic mirror descent dynamics and their convergence in monotone variational inequalities, Journal of Optimization Theory and Applications, 179, 3, 838-867 (2018) · Zbl 1402.90122
[168] Monteiro, R.; Svaiter, B., An accelerated hybrid proximal extragradient method for convex optimization and its implications to second-order methods, SIAM Journal on Optimization, 23, 2, 1092-1125 (2013) · Zbl 1298.90071
[169] Moreau, J. J., Proximité et dualité dans un espace hilbertien, Bulletin de la Société mathématique de France, 93, 273-299 (1965) · Zbl 0136.12101
[170] Necoara, I.; Nesterov, Y.; Glineur, F., Linear convergence of first order methods for non-strongly convex optimization, Mathematical Programming, 175, 1, 69-107 (2019) · Zbl 1412.90111
[171] Nemirovski, A., Orth-method for smooth convex optimization, Izvestia AN SSSR; English transl.: Engineering Cybernetics. Soviet Journal of Computer and Systems Sciences, 2, 937-947 (1982)
[172] Nemirovski, A., Prox-method with rate of convergence \(O(1/t)\) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems, SIAM Journal on Optimization, 15, 1, 229-251 (2004) · Zbl 1106.90059
[173] Nemirovski, A. S.; Yudin, D. B., Problem Complexity and Method Efficiency in Optimization (1983), Wiley: Wiley New York, NY · Zbl 0501.90062
[174] Nemirovskii, A.; Nesterov, Y., Optimal methods of smooth convex minimization, USSR Computational Mathematics and Mathematical Physics, 25, 2, 21-30 (1985) · Zbl 0591.90072
[175] Nesterov, Y., Implementable tensor methods in unconstrained convex optimization, Mathematical Programming (2019) · Zbl 1459.90157
[176] Nesterov, Y., A method of solving a convex programming problem with convergence rate \(O(1/k^2)\), Soviet Mathematics Doklady, 27, 2, 372-376 (1983) · Zbl 0535.90071
[177] Nesterov, Y., Excessive gap technique in nonsmooth convex minimization, SIAM Journal on Optimization, 16, 1, 235-249 (2005) · Zbl 1096.90026
[178] Nesterov, Y., Smooth minimization of non-smooth functions, Mathematical Programming, 103, 1, 127-152 (2005) · Zbl 1079.90102
[179] Nesterov, Y., Dual extrapolation and its applications to solving variational inequalities and related problems, Mathematical Programming, 109, 2, 319-344 (2007) · Zbl 1167.90014
[180] Nesterov, Y., Accelerating the cubic regularization of Newton’s method on convex problems, Mathematical Programming, 112, 1, 159-181 (2008) · Zbl 1167.90013
[181] Nesterov, Y., Primal-dual subgradient methods for convex problems, Mathematical Programming, 120, 1, 221-259 (2009) · Zbl 1191.90038
[182] Nesterov, Y., Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM Journal on Optimization, 22, 2, 341-362 (2012) · Zbl 1257.90073
[183] Nesterov, Y., Gradient methods for minimizing composite functions, Mathematical Programming, 140, 1, 125-161 (2013) · Zbl 1287.90067
[184] Nesterov, Y., Universal gradient methods for convex optimization problems, Mathematical Programming, 152, 1, 381-404 (2015) · Zbl 1327.90216
[185] Nesterov, Y., Complexity bounds for primal-dual methods minimizing the model of objective function, Mathematical Programming, 171, 1-2, 311-330 (2018) · Zbl 1397.90351
[186] Nesterov, Y., Lectures on convex optimization, Vol. 137 of Springer Optimization and Its Applications (2018), Springer International Publishing · Zbl 1427.90003
[187] Nesterov, Y.; Gasnikov, A.; Guminov, S.; Dvurechensky, P., Primal-dual accelerated gradient methods with small-dimensional relaxation oracle, Optimization Methods and Software, 1-28 (2020)
[188] Nesterov, Y.; Nemirovski, A., Interior Point Polynomial methods in Convex programming (1994), SIAM Publications · Zbl 0824.90112
[189] Nesterov, Y.; Spokoiny, V., Random gradient-free minimization of convex functions, Found. Comput. Math., 17, 2, 527-566 (2017) · Zbl 1380.90220
[190] Nesterov, Y.; Stich, S. U., Efficiency of the accelerated coordinate descent method on structured optimization problems, SIAM Journal on Optimization, 27, 1, 110-123 (2017). First presented in May 2015, http://www.mathnet.ru:8080/PresentFiles/11909/7_nesterov.pdf · Zbl 1359.90073
[191] Odor, G.; Li, Y.-H.; Yurtsever, A.; Hsieh, Y.-P.; Tran-Dinh, Q.; Halabi, M. E.; Cevher, V., Frank-Wolfe works for non-Lipschitz continuous gradient objectives: scalable Poisson phase retrieval, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 6230-6234 (2016), IEEE
[192] Opial, Z., Weak convergence of the sequence of successive approximations for nonexpansive mappings, Bulletin of the American Mathematical Society, 73, 4, 591-597 (1967) · Zbl 0179.19902
[193] Parikh, N.; Boyd, S., Proximal algorithms, Foundations and Trends® in Optimization, 1, 3, 127-239 (2014)
[194] Pedregosa, F.; Negiar, G.; Askari, A.; Jaggi, M., Linearly convergent Frank-Wolfe with backtracking line-search, International Conference on Artificial Intelligence and Statistics, PMLR, 1-10 (2020)
[195] Pokutta, S., Restarting algorithms: Sometimes there is free lunch, (Hebrard, E.; Musliu, N., Integration of Constraint Programming, Artificial Intelligence, and Operations Research (2020), Springer International Publishing, Cham), 22-38 · Zbl 07636009
[196] Polyak, B. T., Some methods of speeding up the convergence of iteration methods, USSR Computational Mathematics and Mathematical Physics, 4, 5, 1-17 (1964) · Zbl 0147.35301
[197] Polyak, B. T., Introduction to Optimization (1987), Optimization Software
[198] Robbins, H.; Monro, S., A stochastic approximation method, The Annals of Mathematical Statistics, 22, 3, 400-407 (1951) · Zbl 0054.05901
[199] Robinson, S. M., Generalized equations and their solutions, Part II: Applications to nonlinear programming, Optimality and Stability in Mathematical Programming, 200-221 (1982), Springer · Zbl 0495.90077
[200] Rockafellar, R. T., Convex Analysis (1970), Princeton University Press: Princeton University Press Princeton · Zbl 0202.14303
[201] Rockafellar, R. T., Augmented Lagrangians and applications of the proximal point algorithm in convex programming, Mathematics of Operations Research, 1, 2, 97-116 (1976) · Zbl 0402.90076
[202] Rockafellar, R. T., Monotone operators and the proximal point algorithm, SIAM Journal on Control and Optimization, 14, 5, 877-898 (1976) · Zbl 0358.90053
[203] Rockafellar, R. T.; Wets, R. J.B., Variational analysis, Vol. 317 of A Series of Comprehensive Studies in Mathematics (1998), Springer-Verlag: Springer-Verlag Berlin · Zbl 0888.49001
[204] Rogozin, A.; Bochko, M.; Dvurechensky, P.; Gasnikov, A.; Lukoshkin, V., An accelerated method for decentralized distributed stochastic optimization over time-varying graphs, 2021 60th IEEE Conference on Decision and Control (CDC) (2021)
[205] Roulet, V.; d’Aspremont, A., Sharpness, restart and acceleration, (Guyon, I.; Luxburg, U. V.; Bengio, S.; Wallach, H.; Fergus, R.; Vishwanathan, S.; Garnett, R., Advances in Neural Information Processing Systems, Vol. 30 (2017), Curran Associates, Inc.), 1119-1129
[206] Shalev-Shwartz, S.; Zhang, T., Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization, (Xing, E. P.; Jebara, T., Proceedings of the 31st International Conference on Machine Learning, Vol. 32 of Proceedings of Machine Learning Research, PMLR, Bejing, China (2014)), 64-72
[207] Shapiro, A.; Dentcheva, D.; Ruszczyński, A., Lectures on stochastic programming, Society for Industrial and Applied Mathematics (2009) · Zbl 1183.90005
[208] Shefi, R.; Teboulle, M., Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization, SIAM Journal on Optimization, 24, 1, 269-297 (2014) · Zbl 1291.90176
[209] Shi, B.; Du, S. S.; Su, W.; Jordan, M. I., Acceleration via symplectic discretization of high-resolution differential equations, Advances in Neural Information Processing Systems, 5744-5752 (2019)
[210] Sorin, S., A First Course on Zero-Sum Repeated Games (2000), Springer
[211] Stonyakin, F.; Gasnikov, A.; Dvurechensky, P.; Alkousa, M.; Titov, A., Generalized Mirror Prox for monotone variational inequalities: Universality and inexact oracle
[212] Stonyakin, F.; Tyurin, A.; Gasnikov, A.; Dvurechensky, P.; Agafonov, A.; Dvinskikh, D.; Alkousa, M.; Pasechnyuk, D.; Artamonov, S.; Piskunova, V., Inexact model: A framework for optimization and variational inequalities, Optimization Methods and Software (2020)
[213] Stonyakin, F. S.; Dvinskikh, D.; Dvurechensky, P.; Kroshnin, A.; Kuznetsova, O.; Agafonov, A.; Gasnikov, A.; Tyurin, A.; Uribe, C. A.; Pasechnyuk, D.; Artamonov, S., Gradient methods for problems with inexact model of the objective, (Khachay, M.; Kochetov, Y.; Pardalos, P., Mathematical Optimization Theory and Operations Research (2019), Springer International Publishing: Springer International Publishing Cham), 97-114 · Zbl 1437.90126
[214] Sturm, J. F., Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones, Optimization Methods and Software, 11, 1-4, 625-653 (1999) · Zbl 0973.90526
[215] Su, W.; Boyd, S.; Candes, E. J., A differential equation for modeling Nesterov’s accelerated gradient method: Theory and insights, Journal of Machine Learning Research (2016) · Zbl 1391.90667
[216] Sun, A. X.; Phan, D. T.; Ghosh, S., Fully decentralized AC optimal power flow algorithms, 2013 IEEE Power & Energy Society General Meeting, 1-5 (2013), IEEE
[217] Sun, R., Optimization for deep learning: theory and algorithms
[218] Teboulle, M., Entropic proximal mappings with applications to nonlinear programming, Mathematics of Operations Research, 17, 670-690 (1992) · Zbl 0766.90071
[219] Teboulle, M., A simplified view of first order methods for optimization, Mathematical Programming, 170, 1, 67-96 (2018) · Zbl 1391.90482
[220] Todd, M. J., Minimum-volume ellipsoids, Society for Industrial and Applied Mathematics (2016) · Zbl 1360.90006
[221] Tran-Dinh, Q.; Alacaoglu, A.; Fercoq, O.; Cevher, V., An adaptive primal-dual framework for nonsmooth convex minimization, Mathematical Programming Computation, 12, 3, 451-491 (2020) · Zbl 1452.90246
[222] Tran-Dinh, Q.; Cevher, V., Constrained convex minimization via model-based excessive gap, Proceedings of the 27th International Conference on Neural Information Processing Systems, NIPS’14, 721-729 (2014), MIT Press: MIT Press Cambridge, MA, USA
[223] Tran-Dinh, Q.; Fercoq, O.; Cevher, V., A smooth primal-dual optimization framework for nonsmooth composite convex minimization, SIAM Journal on Optimization, 28, 1, 96-134 (2018) · Zbl 1386.90109
[224] Tseng, P., Applications of a splitting algorithm to decomposition in convex programming and variational inequalities, SIAM Journal on Control and Optimization, 29, 1, 119-138 (1991) · Zbl 0737.90048
[225] Tseng, P., On accelerated proximal gradient methods for convex-concave optimization, Tech. rep. (2008), University of Washington
[226] Tupitsa, N.; Dvurechensky, P.; Gasnikov, A.; Uribe, C. A., Multimarginal optimal transport by accelerated alternating minimization, 2020 59th IEEE Conference on Decision and Control (CDC), 6132-6137 (2020)
[227] Uribe, C. A.; Dvinskikh, D.; Dvurechensky, P.; Gasnikov, A.; Nedić, A., Distributed computation of Wasserstein barycenters over networks, 2018 IEEE Conference on Decision and Control (CDC), 6544-6549 (2018)
[228] Van Nguyen, Q., Forward-backward splitting with Bregman distances, Vietnam Journal of Mathematics, 45, 3, 519-539 (2017) · Zbl 1371.90106
[229] Von Hohenbalken, B., Simplicial decomposition in nonlinear programming algorithms, Mathematical Programming, 13, 1, 49-68 (1977) · Zbl 0362.90086
[230] Vorontsova, E. A.; Gasnikov, A. V.; Gorbunov, E. A., Accelerated directional search with non-euclidean prox-structure, Automation and Remote Control, 80, 4, 693-707 (2019) · Zbl 1434.90143
[231] Vorontsova, E. A.; Gasnikov, A. V.; Gorbunov, E. A.; Dvurechenskii, P. E., Accelerated gradient-free optimization methods with a non-euclidean proximal operator, Automation and Remote Control, 80, 8, 1487-1501 (2019) · Zbl 1434.90144
[232] Wibisono, A.; Wilson, A. C.; Jordan, M. I., A variational perspective on accelerated methods in optimization, Proceedings of the National Academy of Sciences, 113, 47, E7351 (2016) · Zbl 1404.90098
[233] Wolfe, P., Convergence theory in nonlinear programming, (Abadie, J., Integer and nonlinear programming (1970), North-Holland, Amsterdam) · Zbl 0336.90045
[234] Wright, S. J., Optimization algorithms for data analysis, The Mathematics of Data, 25, 49 (2018) · Zbl 1448.68018
[235] Yang, J.; Zhang, Y., Alternating direction algorithms for \(\ell_1\)-problems in compressive sensing, SIAM Journal on Scientific Computing, 33, 1, 250-278 (2011) · Zbl 1256.65060
[236] Yuan, X., Alternating direction method for covariance selection models, Journal of Scientific Computing, 51, 2, 261-273 (2012) · Zbl 1255.65031
[237] Zhang, Y.; Xiao, L., Stochastic primal-dual coordinate method for regularized empirical risk minimization, (Bach, F.; Blei, D., Proceedings of the 32nd International Conference on Machine Learning, Vol. 37 of Proceedings of Machine Learning Research, PMLR, Lille, France (2015)), 353-361
[238] Zhao, R.; Freund, R. M., Analysis of the Frank-Wolfe method for logarithmically-homogeneous barriers, with an extension
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible, without claiming completeness or perfect matching.