Testing the differential network between two Gaussian graphical models with false discovery rate control. (English) Zbl 07862340

Summary: This paper focuses on differential network analysis between two Gaussian graphical models (GGMs). We introduce a new framework for inferring the structural differences between two GGMs by adopting symmetrized data aggregation (SDA), which proceeds by sample splitting, data screening, and information aggregation to achieve false discovery rate (FDR) control. A theoretical guarantee of FDR control is established to verify the validity of the procedure. Simulation studies show that the proposed method controls the FDR reasonably well while retaining high power. The method is applied to a TCGA breast cancer data set to identify gene network rewiring between the luminal A and basal-like subtypes.
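
The aggregation and thresholding step of an SDA-style procedure can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each candidate edge j already has two statistics, one from each half of the split sample (`t1` from the screening half, `t2` from the other half), and that they are combined as the product W_j = T_1j * T_2j, so that null edges give W_j roughly symmetric about zero. The function names `sda_threshold` and `sda_select` are hypothetical, and the construction of the edge statistics for the differential network is not described in the summary, so it is left out.

```python
def sda_threshold(w, alpha):
    """Smallest t > 0 such that (1 + #{w_j <= -t}) / max(#{w_j >= t}, 1) <= alpha.

    The count of large negative w_j serves as an estimate of the number of
    false positives among the large positive w_j (symmetry under the null).
    Returns infinity when no threshold meets the target level.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        neg = sum(1 for x in w if x <= -t)
        pos = sum(1 for x in w if x >= t)
        if (1 + neg) / max(pos, 1) <= alpha:
            return t
    return float("inf")


def sda_select(t1, t2, alpha):
    """Return indices of hypotheses rejected at nominal FDR level alpha.

    t1, t2: per-hypothesis statistics from the two halves of the split
    sample (assumed inputs; how they are computed is outside this sketch).
    """
    w = [a * b for a, b in zip(t1, t2)]  # symmetrized aggregated statistics
    thr = sda_threshold(w, alpha)
    return [j for j, x in enumerate(w) if x >= thr]


# Toy input: six edges whose statistics agree in sign across both halves,
# followed by four edges with small, inconsistent statistics.
t1 = [3.0, -3.0, 4.0, -4.0, 5.0, 3.0, 0.5, -0.4, 0.3, 0.2]
t2 = [3.0, -3.0, 4.0, -4.0, 5.0, 3.0, -0.5, 0.5, 0.2, -0.1]
selected = sda_select(t1, t2, alpha=0.2)
```

In this toy input the six consistent edges are selected and the four noisy ones are not. The extra 1 in the numerator makes the rule conservative, so a small number of strong signals may fail to pass even when they are clearly separated from the noise.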

MSC:

62-XX Statistics

Software:

glasso; huge
