
Pliable Lasso for the multinomial logistic regression. (English) Zbl 07533649

Summary: In this paper, we study multinomial logistic regression with interaction effects. Our approach implements the pliable lasso penalty, which allows for estimating both the main effects of the covariates \(X\) and the interaction effects between the covariates and a set of modifiers \(Z\). The hierarchical penalty helps to avoid over-fitting by excluding the interaction effects when the corresponding main effects are zero. The original log-likelihood model is transformed into an iteratively reweighted least squares problem with the pliable lasso penalty, which is then solved by block-wise coordinate descent. Our results show that the pliable lasso for multinomial logistic regression performs well in multi-class classification problems that involve interacting variables.
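For orientation, here is a minimal sketch of the penalized model in question, written in the notation of the single-response pliable lasso of Tibshirani and Friedman [25] rather than the authors' own (in the multinomial model, one such coefficient block is fitted per class). With covariates \(X \in \mathbb{R}^{n \times p}\), modifiers \(Z \in \mathbb{R}^{n \times K}\), and \(X_j\) the \(j\)-th column of \(X\), the linear predictor is
\[
\hat{\eta} = \beta_0 \mathbf{1} + Z\theta_0 + \sum_{j=1}^{p} X_j \circ (\beta_j \mathbf{1} + Z\theta_j),
\]
with \(\circ\) denoting the componentwise product, and the pliable lasso adds the penalty
\[
(1-\alpha)\lambda \sum_{j=1}^{p} \big( \|(\beta_j, \theta_j)\|_2 + \|\theta_j\|_2 \big) \;+\; \alpha\lambda \sum_{j=1}^{p} \|\theta_j\|_1 .
\]
The overlapping group terms ensure that \(\theta_j\) can be nonzero only if \(\beta_j\) is nonzero, so an interaction enters only alongside its main effect; this is the hierarchy invoked above. In the fitting algorithm, each outer step replaces the multinomial log-likelihood for one class by its usual quadratic (IRLS) approximation, a weighted least-squares criterion with weights \(w_i = p_i(1 - p_i)\) and working responses \(z_i = \eta_i + (y_i - p_i)/w_i\), to which block-wise coordinate descent over the blocks \((\beta_j, \theta_j)\) is applied.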

MSC:

62-XX Statistics

References:

[1] Calster, B. V.; Hoorde, K. V.; Vergouwe, Y.; Bobdiwala, S.; Condous, G.; Kirk, E.; Bourne, T.; Steyerberg, E. W., Validation and updating of risk models based on multinomial logistic regression, Diagnostic and Prognostic Research, 1, 1, 2 (2017) · doi:10.1186/s41512-016-0002-x
[2] Czepiel, S. A., Maximum likelihood estimation of logistic regression models: Theory and implementation (2002)
[3] Du, W.; Tibshirani, R., A pliable lasso for the Cox model (2018). arXiv:1807.06770
[4] Dua, D.; Graff, C., UCI machine learning repository (2019), Irvine, CA: University of California, School of Information and Computer Science
[5] Friedman, J.; Hastie, T.; Tibshirani, R., Regularization paths for generalized linear models via coordinate descent, Journal of Statistical Software, 33, 1, 1-22 (2010) · doi:10.18637/jss.v033.i01
[6] Greene, W. H., Econometric analysis (2003), Upper Saddle River, NY: Prentice Hall, Upper Saddle River, NY
[7] Gujarati, D. N., Basic econometrics (2004), New York, NY: McGraw-Hill Companies, New York, NY
[8] Gutkin, M.; Shamir, R.; Dror, G., SlimPLS: A method for feature selection in gene expression-based disease classification, PLoS One, 4, 7, e6416 (2009) · doi:10.1371/journal.pone.0006416
[9] Hastie, T.; Tibshirani, R., Generalized additive models, Statistical Science, 1, 3, 314-18 (1986) · Zbl 0645.62068 · doi:10.1214/ss/1177013609
[10] Hastie, T.; Tibshirani, R.; Friedman, J., The elements of statistical learning: Data mining, inference, and prediction (2009), New York, NY: Springer, New York, NY · Zbl 1273.62005
[11] Hastie, T.; Tibshirani, R.; Wainwright, M., Statistical learning with sparsity: The lasso and generalizations (2015), New York, NY: Chapman & Hall/CRC · Zbl 1319.68003
[12] He, Y.; Chen, Z., The ebic and a sequential procedure for feature selection in interactive linear models with high-dimensional data, Annals of the Institute of Statistical Mathematics, 68, 1, 155-80 (2016) · Zbl 1440.62269 · doi:10.1007/s10463-014-0497-2
[13] Khan, J.; Wei, J. S.; Ringnér, M.; Saal, L. H.; Ladanyi, M.; Westermann, F.; Berthold, F.; Schwab, M.; Antonescu, C. R.; Peterson, C., Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks, Nature Medicine, 7, 6, 673-79 (2001) · doi:10.1038/89044
[14] Lee, S.-I.; Lee, H.; Abbeel, P.; Ng, A. Y., Efficient \(L_1\) regularized logistic regression, Proceedings of the 21st National Conference on Artificial Intelligence (AAAI) (2006)
[15] Liang, Y.; Liu, C.; Luan, X.-Z.; Leung, K.-S.; Chan, T.-M.; Xu, Z.-B.; Zhang, H., Sparse logistic regression with a \(L_{1/2}\) penalty for gene selection in cancer classification, BMC Bioinformatics, 14, 1 (2013) · doi:10.1186/1471-2105-14-198
[16] Lim, M.; Hastie, T., Learning interactions via hierarchical group-lasso regularization, Journal of Computational and Graphical Statistics, 24, 3, 627-54 (2015) · doi:10.1080/10618600.2014.938812
[17] McCullagh, P.; Nelder, J. A., Generalized linear models (1983), London: Chapman and Hall, London · Zbl 0588.62104
[18] Mehra, N.; Gupta, S., Survey on multiclass classification methods, International Journal of Computer Science and Information Technologies, 4, 4 (2013)
[19] Nelder, J. A.; Wedderburn, R. W. M., Generalized linear models, Journal of the Royal Statistical Society. Series A, 135, 3 (1972)
[20] Obuchi, T.; Kabashima, Y., Accelerating cross-validation in multinomial logistic regression with \(L_1\)-regularization, Journal of Machine Learning Research, 19 (2018) · Zbl 1467.62128
[21] Park, H.-A., An introduction to logistic regression: From basic concepts to interpretation with particular attention to nursing domain, Journal of Korean Academy of Nursing, 43, 2, 154-64 (2013) · doi:10.4040/jkan.2013.43.2.154
[22] Ramaswamy, S.; Tamayo, P.; Rifkin, R.; Mukherjee, S.; Yeang, C.-H.; Angelo, M.; Ladd, C.; Reich, M.; Latulippe, E.; Mesirov, J. P., Multiclass cancer diagnosis using tumor gene expression signatures, Proceedings of the National Academy of Sciences of the United States of America, 98, 26, 15149-54 (2001) · doi:10.1073/pnas.211566398
[23] Sørlie, T.; Perou, C. M.; Tibshirani, R.; Aas, T.; Geisler, S.; Johnsen, H.; Hastie, T.; Eisen, M. B.; van de Rijn, M.; Jeffrey, S. S., Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications, Proceedings of the National Academy of Sciences of the United States of America, 98, 19, 10869-74 (2001) · doi:10.1073/pnas.191367098
[24] Tibshirani, R., Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B, 58, 1 (1996) · Zbl 0850.62538
[25] Tibshirani, R.; Friedman, J., A pliable lasso, Journal of Computational and Graphical Statistics, 29, 1, 215-25 (2020) · Zbl 07499284
[26] Zhu, J.; Hastie, T., Classification of gene microarrays by penalized logistic regression, Biostatistics, 5, 3, 427-43 (2004) · Zbl 1154.62406 · doi:10.1093/biostatistics/kxg046