
Empirical risk minimization under random censorship. (English) Zbl 07625158

Summary: We consider the classic supervised learning problem in which a continuous nonnegative random label \(Y\) (e.g. a random duration) is to be predicted from a random vector \(X\) valued in \(\mathbb{R}^d\), \(d\geq 1\), by means of a regression rule with minimum least squares error. In various applications, ranging from industrial quality control to public health and credit risk analysis, training observations can be right censored: rather than on independent copies of \((X,Y)\), statistical learning relies on a collection of \(n\geq 1\) independent realizations of the triplet \((X, \; \min\{Y,\; C\},\; \delta)\), where \(C\) is a nonnegative random variable with unknown distribution that models censoring, and \(\delta=\mathbb{I}\{Y\leq C\}\) indicates whether the duration is observed (\(\delta=1\)) or right censored (\(\delta=0\)). Because ignoring censoring in the risk computation may lead to a severe underestimation of the target duration and jeopardize prediction, we consider a plug-in estimate of the true risk, referred to as Beran risk, which is based on a Kaplan-Meier estimator of the conditional survival function of the censoring \(C\) given \(X\), and use it to perform empirical risk minimization. It is established, under mild conditions, that the learning rate of minimizers of this biased/weighted empirical risk functional is of order \(O_{\mathbb{P}}(\sqrt{\log(n)/n})\) when ignoring model bias issues inherent to plug-in estimation, which is the rate that can be attained in the absence of censoring. Beyond theoretical results, numerical experiments are presented to illustrate the relevance of the approach.
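
To make the weighted risk concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the standard inverse-probability-of-censoring form, in which each uncensored squared error is divided by an estimate of \(\mathbb{P}(C\geq t \mid X=x)\) evaluated at the observed duration, and it assumes a Gaussian-kernel conditional Kaplan-Meier estimate of that probability with no ties among durations. The summary does not specify the exact estimator, kernel, or bandwidth, so the function names `beran_censoring_survival` and `ipcw_risk` and the parameters `h` and `floor` are hypothetical.

```python
import numpy as np

def beran_censoring_survival(x, t, X, T, delta, h):
    """Kernel conditional Kaplan-Meier estimate of P(C >= t | X = x).

    X: (n, d) covariates, T: (n,) observed min(Y, C),
    delta: (n,) 1 if uncensored, 0 if censored, h: bandwidth (assumed).
    Assumes no ties among the observed durations.
    """
    # Gaussian kernel weights centred at x, normalised to sum to one.
    w = np.exp(-0.5 * np.sum((X - x) ** 2, axis=1) / h ** 2)
    w = w / w.sum()
    order = np.argsort(T)
    T_s, d_s, w_s = T[order], delta[order], w[order]
    # Weighted mass still at risk at each sorted time: sum of w_j with T_j >= T_i.
    at_risk = np.cumsum(w_s[::-1])[::-1]
    # For the censoring distribution, the "events" are the censored points
    # (delta == 0); only those strictly before t enter the product.
    is_event = (d_s == 0) & (T_s < t)
    factors = np.where(is_event, 1.0 - w_s / at_risk, 1.0)
    return float(np.prod(factors))

def ipcw_risk(predict, X, T, delta, h, floor=1e-3):
    """Weighted empirical squared-error risk of the rule `predict`.

    Censored points contribute zero; uncensored points are reweighted by
    the inverse of the estimated censoring survival, floored for stability.
    """
    total = 0.0
    for i in range(len(T)):
        if delta[i] == 1:
            s = max(beran_censoring_survival(X[i], T[i], X, T, delta, h), floor)
            total += (T[i] - predict(X[i])) ** 2 / s
    return total / len(T)
```

When no observation is censored, every weight equals one and `ipcw_risk` reduces to the ordinary mean squared error, which gives a quick sanity check. The `floor` guard is a practical choice made for this sketch and is not taken from the summary.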


68T05 Learning and adaptive systems in artificial intelligence

