
When random initializations help: a study of variational inference for community detection. (English) Zbl 1539.62170

Summary: Variational approximation has recently been widely used in large-scale Bayesian inference; its simplest form imposes a mean field assumption to approximate complicated latent structures. Despite the computational scalability of mean field methods, theoretical studies of the loss-function surface and of the convergence behavior of the iterative updates that optimize it are far from complete. In this paper, we focus on community detection for a simple two-class Stochastic Blockmodel (SBM) with equal class sizes. Using batch coordinate ascent variational inference (BCAVI) for the updates, we show that the convergence behavior depends on the initialization. When the parameters are known, or are estimated within a reasonable range and held fixed, we characterize the conditions under which an initialization converges to the ground truth. When the parameters instead need to be estimated iteratively, a random initialization converges to an uninformative local optimum.
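The BCAVI scheme the summary refers to can be sketched for the symmetric two-class SBM with known connection probabilities. The sketch below is an illustration, not the authors' code: `bcavi_two_class_sbm`, the parameter names, and the stopping rule (a fixed iteration count) are all assumptions; the update itself is the standard batch mean-field step, where every variational probability is refreshed simultaneously from the previous iterate.

```python
import numpy as np

def bcavi_two_class_sbm(A, p, q, psi0, n_iter=30):
    """Batch coordinate ascent (BCAVI) mean-field updates for a
    symmetric two-class SBM with known edge probabilities.

    A    : (n, n) symmetric 0/1 adjacency matrix, zero diagonal
    p, q : within- and between-community edge probabilities (p > q)
    psi0 : (n,) initial variational probabilities P(node i in class 1)
    """
    t1 = np.log(p / q)              # log-odds contribution of an edge
    t2 = np.log((1 - p) / (1 - q))  # contribution of a non-edge
    M = A * t1 + (1 - A) * t2
    np.fill_diagonal(M, 0.0)        # no self-interaction term
    psi = psi0.copy()
    for _ in range(n_iter):
        # batch update: all psi_i recomputed from the previous iterate
        s = M @ (2 * psi - 1)
        psi = 1.0 / (1.0 + np.exp(-s))  # sigmoid
    return psi
```

Run on a well-separated SBM (e.g. p = 0.8, q = 0.2), an initialization correlated with the ground truth drives `psi` toward the indicator vector of the true communities, which is the fixed-parameter regime analyzed in the paper.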

MSC:

62H22 Probabilistic graphical models
62F15 Bayesian inference
62H30 Classification and discrimination; cluster analysis (statistical aspects)
