A portmanteau local feature discrimination approach to the classification with high-dimensional matrix-variate data. (English) Zbl 07730304

Summary: Matrix-variate data arise in many scientific fields such as face recognition and medical imaging. Matrix data contain important structural information that can be destroyed by vectorization, so methods that incorporate this structure into the analysis have significant advantages over vectorization approaches. In this article, we consider the problem of two-class classification with high-dimensional matrix-variate data and propose a novel portmanteau-local-feature discrimination (PLFD) method. The method first identifies local discriminative features of the matrix variate and then pools them together to construct a discrimination rule. We investigate the theoretical properties of the PLFD method and establish its asymptotic optimality. We carried out extensive numerical studies, including simulations and real data analyses, comparing this method with other methods available in the literature; these demonstrate that the PLFD method has a great advantage over the other methods in terms of misclassification rate.
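The summary describes a two-step scheme: screen the matrix for locally discriminative entries, then pool the selected entries into a classification rule. The sketch below is only an illustrative stand-in for that idea, not the paper's actual PLFD procedure: it screens entries with an entrywise two-sample t-type statistic and pools the survivors into a diagonal-covariance linear rule. All function names, the threshold value, and the simulated signal block are assumptions made for the example.

```python
import numpy as np

def screen_features(X0, X1, threshold=2.0):
    # X0, X1: arrays of shape (n_k, p, q) holding matrix-valued observations
    # per class. Compute an entrywise two-sample t-type statistic and keep
    # entries exceeding the threshold -- a stand-in for the local-feature
    # identification step described in the summary.
    n0, n1 = X0.shape[0], X1.shape[0]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    v0, v1 = X0.var(axis=0, ddof=1), X1.var(axis=0, ddof=1)
    se = np.sqrt(v0 / n0 + v1 / n1)
    t = (m1 - m0) / np.maximum(se, 1e-12)
    return np.abs(t) > threshold        # boolean (p, q) mask of selected entries

def fit_rule(X0, X1, mask):
    # Pool the selected entries into a diagonal-covariance linear rule
    # (a naive-Bayes-style simplification, not the paper's construction).
    m0, m1 = X0.mean(axis=0)[mask], X1.mean(axis=0)[mask]
    v = 0.5 * (X0.var(axis=0, ddof=1)[mask] + X1.var(axis=0, ddof=1)[mask])
    v = np.maximum(v, 1e-12)
    w = (m1 - m0) / v                   # discriminant direction
    b = -0.5 * w @ (m0 + m1)            # threshold at the class midpoint
    return mask, w, b

def classify(X, rule):
    mask, w, b = rule
    scores = X[:, mask] @ w + b         # linear score per observation
    return (scores > 0).astype(int)     # 1 -> class 1, 0 -> class 0

# Simulated matrix-variate data: the class difference lives in a local
# 3x3 block, mimicking a "local discriminative feature".
rng = np.random.default_rng(0)
p, q = 20, 20
mu1 = np.zeros((p, q))
mu1[:3, :3] = 2.0
X0 = rng.standard_normal((100, p, q))
X1 = mu1 + rng.standard_normal((100, p, q))

rule = fit_rule(X0, X1, screen_features(X0, X1))
err = 0.5 * (classify(X0, rule).mean() + (1 - classify(X1, rule)).mean())
print(f"training error: {err:.3f}")
```

Because the screening step operates entrywise, the selected set adapts to wherever the mean difference is localized in the matrix, which is the intuition behind pooling local features rather than vectorizing the whole matrix.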

MSC:

62Hxx Multivariate analysis
62H30 Classification and discrimination; cluster analysis (statistical aspects)

Software:

sparcl
Full Text: DOI

References:

[1] Anderson, TW, An introduction to multivariate statistical analysis (2003), New York: Wiley, New York · Zbl 1039.62044
[2] Guo, J., Simultaneous variable selection and class fusion for high-dimensional linear discriminant analysis, Biostatistics, 11, 599-608 (2010) · Zbl 1437.62480 · doi:10.1093/biostatistics/kxq023
[3] Gupta, A.; Nagar, D., Matrix variate distributions. No. 104 in Monographs and Surveys in Pure and Applied Mathematics (1999), Boca Raton, FL: Chapman & Hall/CRC
[4] Gupta, AK; Varga, T., Elliptically contoured models in statistics (1993), Berlin: Springer, Berlin · Zbl 0789.62037 · doi:10.1007/978-94-011-1646-6
[5] Hung, H.; Wang, CC, Matrix variate logistic regression model with application to EEG data, Biostatistics, 14, 189-202 (2013) · doi:10.1093/biostatistics/kxs023
[6] Koltchinskii, V.; Lounici, K.; Tsybakov, AB, Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion, Ann. Stat., 39, 2302-2329 (2011) · Zbl 1231.62097 · doi:10.1214/11-AOS894
[7] Lai, Z.; Xu, Y.; Yang, J.; Tang, J.; Zhang, D., Sparse tensor discriminant analysis, IEEE Trans. Image Process., 22, 3904-3915 (2013) · Zbl 1373.94218 · doi:10.1109/TIP.2013.2264678
[8] Li, B.; Kim, MK; Altman, N., On dimension folding of matrix- or array-valued statistical objects, Ann. Stat., 38, 1094-1121 (2010) · Zbl 1183.62091 · doi:10.1214/09-AOS737
[9] Li, M.; Yuan, B., 2D-LDA: A statistical linear discriminant analysis for image matrix, Pattern Recognit. Lett., 26, 527-532 (2005)
[10] Li, Q.; Schonfeld, D., Multilinear discriminant analysis for higher-order tensor data classification, IEEE Trans. Pattern Anal. Mach. Intell., 36, 2524-2537 (2014) · doi:10.1109/TPAMI.2014.2342214
[11] Luo, S.; Chen, Z., A procedure of linear discrimination analysis with detected sparsity structure for high-dimensional multi-class classification, J. Multivar. Anal., 179, 104641 (2020) · Zbl 1512.62063 · doi:10.1016/j.jmva.2020.104641
[12] Masulli, F. and Rovetta, S. (2015). Clustering high-dimensional data. In Clustering High-Dimensional Data. Springer, Berlin, pp 1-13. · Zbl 1103.68782
[13] Molstad, AJ; Rothman, AJ, A penalized likelihood method for classification with matrix-valued predictors, J. Comput. Graph. Stat., 28, 11-22 (2019) · Zbl 07499008 · doi:10.1080/10618600.2018.1476249
[14] Pan, Y.; Mai, Q.; Zhang, X., Covariate-adjusted tensor classification in high dimensions, J. Am. Stat. Assoc., 114, 1305-1319 (2019) · Zbl 1428.62291 · doi:10.1080/01621459.2018.1497500
[15] Recht, B.; Fazel, M.; Parrilo, PA, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization, SIAM Rev., 52, 471-501 (2010) · Zbl 1198.90321 · doi:10.1137/070697835
[16] Shao, J.; Wang, Y.; Deng, X.; Wang, S., Sparse linear discriminant analysis by thresholding for high dimensional data, Ann. Stat., 39, 1241-1265 (2011) · Zbl 1215.62062
[17] Witten, DM; Tibshirani, R., A framework for feature selection in clustering, J. Am. Stat. Assoc., 105, 713-726 (2010) · Zbl 1392.62194 · doi:10.1198/jasa.2010.tm09415
[18] Xu, Z. (2020). Sparse linear discriminant analysis for high dimensional gaussian matrix-valued predictors. PhD thesis, Shanghai Jiao Tong University.
[19] Zhang, XL; Begleiter, H.; Porjesz, B.; Wang, W.; Litke, A., Event related potentials during object recognition tasks, Brain Res. Bull., 38, 531-538 (1995) · doi:10.1016/0361-9230(95)02023-5
[20] Zheng, WS; Lai, JH; Li, SZ, 1D-LDA vs. 2D-LDA: When is vector-based linear discriminant analysis better than matrix-based?, Pattern Recognit., 41, 2156-2172 (2008) · Zbl 1138.68051 · doi:10.1016/j.patcog.2007.11.025
[21] Zhong, W.; Suslick, KS, Matrix discriminant analysis with application to colorimetric sensor array data, Technometrics, 57, 524 (2015)
[22] Zhou, H.; Li, L., Regularized matrix regression, J. R. Stat. Soc. Ser. B-stat. Methodol., 76, 463-483 (2014) · Zbl 07555458 · doi:10.1111/rssb.12031
[23] Zhou, H.; Li, L.; Zhu, H., Tensor regression with applications in neuroimaging data analysis, J. Am. Stat. Assoc., 108, 540-552 (2013) · Zbl 06195959 · doi:10.1080/01621459.2013.776499