Partial least squares for heterogeneous data. (English) Zbl 1366.62131

Abdi, Hervé (ed.) et al., The multiple facets of partial least squares methods. PLS, Paris, France, May 26–28, 2014. Cham: Springer (ISBN 978-3-319-40641-1/hbk; 978-3-319-40643-5/ebook). Springer Proceedings in Mathematics & Statistics 173, 3-15 (2016).
Summary: Large-scale data, where the sample size and the dimension are high, often exhibits heterogeneity. This can arise for example in the form of unknown subgroups or clusters, batch effects or contaminated samples. Ignoring these issues would often lead to poor prediction and estimation. We advocate the maximin effects framework [N. Meinshausen and the author, Ann. Stat. 43, No. 4, 1801–1830 (2015; Zbl 1317.62059)] to address the problem of heterogeneous data. In combination with partial least squares (PLS) regression, we obtain a new PLS procedure which is robust and tailored for large-scale heterogeneous data. A small empirical study complements our exposition of new PLS methodology.
For the entire collection see [Zbl 1356.62003].

MSC:

62J05 Linear regression; mixed models
62C20 Minimax procedures in statistical decision theory

Citations:

Zbl 1317.62059

References:

[1] Breiman, L.: Bagging predictors. Mach. Learn. 24, 123–140 (1996) · Zbl 0858.68080
[2] Bühlmann, P., Meinshausen, N.: Magging: maximin aggregation for inhomogeneous large-scale data. Proc. of the IEEE 104, 126–135 (2016) · doi:10.1109/JPROC.2015.2494161
[3] Bühlmann, P., van de Geer, S.: Statistics for High-Dimensional Data: Methods, Theory and Applications. Springer, New York (2011) · Zbl 1273.62015 · doi:10.1007/978-3-642-20192-9
[4] Bühlmann, P., Yu, B.: Analyzing bagging. Ann. Stat. 30, 927–961 (2002) · Zbl 1029.62037 · doi:10.1214/aos/1031689014
[5] Esposito Vinzi, V., Chin, W.W., Henseler, J., Wang, H.: Handbook of Partial Least Squares: Concepts, Methods and Applications. Springer, New York (2010) · Zbl 1186.62001 · doi:10.1007/978-3-540-32827-8
[6] Frank, L.E., Friedman, J.H.: A statistical view of some chemometrics regression tools. Technometrics 35 (2), 109–135 (1993) · Zbl 0775.62288 · doi:10.1080/00401706.1993.10485033
[7] Geladi, P., Kowalski, B.R.: Partial least-squares regression: a tutorial. Analytica Chimica Acta 185, 1–17 (1986) · doi:10.1016/0003-2670(86)80028-9
[8] Hall, P., Samworth, R.J.: Properties of bagged nearest neighbour classifiers. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 67, 363–379 (2005) · Zbl 1069.62051 · doi:10.1111/j.1467-9868.2005.00506.x
[9] Hastie, T., Tibshirani, R.: Varying-coefficient models. J. R. Stat. Soc. Ser. B (Statist. Methodol.) 55, 757–796 (1993) · Zbl 0796.62060
[10] Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, New York (2009) · Zbl 1273.62005
[11] Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12, 55–67 (1970) · Zbl 0202.17205 · doi:10.1080/00401706.1970.10488634
[12] Huber, P.J.: Robust Statistics, 2nd edn. Springer, New York (2011)
[13] McLachlan, G., Peel, D.: Finite Mixture Models. Wiley, New York (2004) · Zbl 0963.62061
[14] Meinshausen, N., Bühlmann, P.: Maximin effects in inhomogeneous large-scale data. Ann. Statist. 43, 1801–1830 (2015) · Zbl 1317.62059 · doi:10.1214/15-AOS1325
[15] Pinheiro, J., Bates, D.: Mixed-Effects Models in S and S-PLUS. Springer, New York (2000) · Zbl 0953.62065 · doi:10.1007/978-1-4419-0318-1
[16] R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. http://www.R-project.org (2014)
[17] Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B (Statist. Methodol.) 58, 267–288 (1996) · Zbl 0850.62538
[18] Wold, H.: Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P. (ed.) Multivariate Analysis, pp. 391–420. Academic, New York (1966) · Zbl 0214.46103
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.