
pISTA: preconditioned iterative soft thresholding algorithm for graphical LASSO. (English) Zbl 07837060

Summary: We propose a novel quasi-Newton method for solving the sparse inverse covariance estimation problem, also known as the graphical least absolute shrinkage and selection operator (GLASSO) problem. This problem is often solved via a second-order quadratic approximation, but in such algorithms the Hessian term is complex and computationally expensive to handle. Therefore, our method uses the inverse of the Hessian as a preconditioner, simplifying and approximating the quadratic term at the cost of a more complicated \(\ell_1\) term. The variables of the resulting preconditioned problem are coupled only through each other's \(\ell_1\) subderivatives, which can be guessed at minimal cost from the gradient itself; this allows the algorithm to be parallelized and implemented efficiently on GPU hardware accelerators. Numerical results on synthetic and real data demonstrate that our method is competitive with other state-of-the-art approaches.
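For orientation, the GLASSO objective the summary refers to is \(\min_{A \succ 0} -\log\det A + \operatorname{tr}(SA) + \lambda \|A\|_1\), where \(S\) is the sample covariance matrix. The following NumPy sketch implements the plain, unpreconditioned iterative soft-thresholding baseline in the spirit of G-ISTA [15], not the paper's pISTA: pISTA's contribution is to precondition this gradient step with the inverse Hessian \(A \otimes A\), which is omitted here. All function names, step sizes, and the stopping rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def soft_threshold(X, tau):
    # Entrywise soft-thresholding: the proximal operator of tau * ||.||_1.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def glasso_ista(S, lam, t0=1.0, iters=500, tol=1e-6):
    # Plain ISTA baseline (G-ISTA style [15]) for
    #   min_{A > 0}  -log det(A) + tr(S A) + lam * ||A||_1.
    # Sketch only; pISTA additionally preconditions the gradient
    # step with the inverse Hessian, which is NOT done here.
    A = np.diag(1.0 / np.diag(S))            # common diagonal initialization
    for _ in range(iters):
        grad = S - np.linalg.inv(A)          # gradient of the smooth part
        t = t0
        while True:                          # halve the step until the
            A_new = soft_threshold(A - t * grad, t * lam)  # iterate stays PD
            try:
                np.linalg.cholesky(A_new)    # raises LinAlgError if not PD
                break
            except np.linalg.LinAlgError:
                t *= 0.5
        if np.linalg.norm(A_new - A) <= tol * max(np.linalg.norm(A), 1.0):
            return A_new
        A = A_new
    return A
```

A practical implementation would also enforce a sufficient-decrease condition in the backtracking loop; the Cholesky check above only guarantees positive definiteness, which suffices to illustrate the iteration.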

MSC:

90C25 Convex programming
65D18 Numerical aspects of computer graphics, image analysis, and computational geometry
65K10 Numerical optimization and variational techniques
65F08 Preconditioners for iterative methods

Software:

QUIC; GMRFLib; glasso; HdBCS

References:

[1] Banerjee, O., El Ghaoui, L., and d’Aspremont, A., Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data, J. Mach. Learn. Res., 9 (2008), pp. 485-516. · Zbl 1225.68149
[2] Banerjee, O., El Ghaoui, L., d’Aspremont, A., and Natsoulis, G., Convex optimization techniques for fitting sparse Gaussian graphical models, in Proceedings of the 23rd International Conference on Machine Learning, 2006, pp. 89-96.
[3] Beck, A. and Teboulle, M., A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2 (2009), pp. 183-202, doi:10.1137/080716542. · Zbl 1175.94009
[4] Bollhöfer, M., Eftekhari, A., Scheidegger, S., and Schenk, O., Large-scale sparse inverse covariance matrix estimation, SIAM J. Sci. Comput., 41 (2019), pp. A380-A401, doi:10.1137/17M1147615. · Zbl 1414.65043
[5] Boyd, S. and Vandenberghe, L., Convex Optimization, Cambridge University Press, 2004. · Zbl 1058.90049
[6] Chen, X., Liu, Y., Liu, H., and Carbonell, J., Learning spatial-temporal varying graphs with applications to climate data analysis, in AAAI Conference on Artificial Intelligence, AAAI, 2010, pp. 425-430.
[7] d’Aspremont, A., Banerjee, O., and El Ghaoui, L., First-order methods for sparse covariance selection, SIAM J. Matrix Anal. Appl., 30 (2008), pp. 56-66, doi:10.1137/060670985. · Zbl 1156.90423
[8] Dempster, A. P., Covariance selection, Biometrics, 28 (1972), pp. 157-175.
[9] Dobra, A., Hans, C., Jones, B., Nevins, J., Yao, G., and West, M., Sparse graphical models for exploring gene expression data, J. Multivariate Anal., 90 (2004), pp. 196-212. · Zbl 1047.62104
[10] Duchi, J., Gould, S., and Koller, D., Projected subgradient methods for learning sparse Gaussians, in Uncertainty in Artificial Intelligence, AUAI Press, 2008, pp. 153-160.
[11] Fan, J., Zhang, J., and Yu, K., Vast portfolio selection with gross-exposure constraints, J. Amer. Statist. Assoc., 107 (2012), pp. 592-606. · Zbl 1261.62091
[12] Finder, S. E., Treister, E., and Freifeld, O., Effective learning of a GMRF mixture model, IEEE Access, 10 (2022), pp. 7289-7299.
[13] Friedman, J., Hastie, T., and Tibshirani, R., Sparse inverse covariance estimation with the graphical lasso, Biostatistics, 9 (2008), pp. 432-441. · Zbl 1143.62076
[14] Goldenberg, A. and Moore, A. W., Bayes net graphs to understand co-authorship networks?, in Proceedings of the 3rd International Workshop on Link Discovery, ACM, 2005, pp. 1-8.
[15] Guillot, D., Rajaratnam, B., Rolfs, B., Maleki, A., and Wong, I., Iterative thresholding algorithm for sparse inverse covariance estimation, in Advances in Neural Information Processing Systems, Vol. 25, Curran Associates, 2012.
[16] Hastie, T., Tibshirani, R., and Wainwright, M., Statistical Learning with Sparsity: The Lasso and Generalizations, Chapman & Hall/CRC, 2015. · Zbl 1319.68003
[17] Hsieh, C.-J., Sustik, M. A., Dhillon, I. S., and Ravikumar, P., QUIC: Quadratic approximation for sparse inverse covariance estimation, J. Mach. Learn. Res., 15 (2014), pp. 2911-2947. · Zbl 1319.65048
[18] Hsieh, C.-J., Sustik, M. A., Dhillon, I. S., Ravikumar, P. K., and Poldrack, R., BIG & QUIC: Sparse inverse covariance estimation for a million variables, in Neural Information Processing Systems, Vol. 26, Curran Associates, 2013, pp. 3165-3173.
[19] Idé, T., Lozano, A. C., Abe, N., and Liu, Y., Proximity-based anomaly detection using sparse structure learning, in Proceedings of the 2009 SIAM International Conference on Data Mining (SDM), SIAM, 2009, pp. 97-108.
[20] Lauritzen, S., Graphical Models, Clarendon Press, 1996. · Zbl 0907.62001
[21] Li, L. and Toh, K.-C., An inexact interior point method for \(L_1\)-regularized sparse covariance selection, Math. Program. Comput., 2 (2010), pp. 291-315. · Zbl 1208.90131
[22] Lu, Z., Smooth optimization approach for sparse covariance selection, SIAM J. Optim., 19 (2009), pp. 1807-1827, doi:10.1137/070695915. · Zbl 1179.90257
[23] Olsen, P. A., Oztoprak, F., Nocedal, J., and Rennie, S. J., Newton-like methods for sparse inverse covariance estimation, in Neural Information Processing Systems, Vol. 25, Curran Associates, 2012, pp. 755-763.
[24] Rue, H. and Held, L., Gaussian Markov Random Fields: Theory and Applications, CRC Press, 2005. · Zbl 1093.60003
[25] Scheinberg, K., Ma, S., and Goldfarb, D., Sparse inverse covariance selection via alternating linearization methods, in Neural Information Processing Systems, Vol. 23, Curran Associates, 2010, pp. 2101-2109.
[26] Tibshirani, R., Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B, 58 (1996), pp. 267-288. · Zbl 0850.62538
[27] Treister, E. and Turek, J. S., A block-coordinate descent approach for large-scale sparse inverse covariance estimation, in Neural Information Processing Systems, Vol. 27, Curran Associates, 2014.
[28] Treister, E., Turek, J. S., and Yavneh, I., A multilevel framework for sparse optimization with application to inverse covariance estimation and logistic regression, SIAM J. Sci. Comput., 38 (2016), pp. S566-S592, doi:10.1137/15M102469X. · Zbl 1348.90466
[29] Treister, E. and Yavneh, I., A multilevel iterated-shrinkage approach to \(l_1\) penalized least-squares minimization, IEEE Trans. Signal Process., 60 (2012), pp. 6319-6329. · Zbl 1393.94463
[30] Wright, S. J., Nowak, R. D., and Figueiredo, M. A. T., Sparse reconstruction by separable approximation, IEEE Trans. Signal Process., 57 (2009), pp. 2479-2493. · Zbl 1391.94442
[31] Xuan, X. and Murphy, K., Modeling changing dependency structure in multivariate time series, in Proceedings of the 24th International Conference on Machine Learning, , ACM, 2007, pp. 1055-1062.
[32] Yuan, M. and Lin, Y., Model selection and estimation in the Gaussian graphical model, Biometrika, 94 (2007), pp. 19-35. · Zbl 1142.62408