Abstract
One of the main goals of machine learning theory is to characterize the generalization performance of learning algorithms. Most previous results on the generalization ability of learning algorithms assume independent and identically distributed (i.i.d.) samples. Independence, however, is a restrictive assumption for both theory and real-world applications. In this paper we go beyond this classical framework by establishing bounds on the rate of relative uniform convergence for the Empirical Risk Minimization (ERM) algorithm with uniformly ergodic Markov chain samples. We not only obtain generalization bounds for the ERM algorithm, but also show that the ERM algorithm with uniformly ergodic Markov chain samples is consistent. The established theory underlies the application of ERM-type learning algorithms.
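The setting studied in the paper can be illustrated with a minimal sketch, not taken from the paper itself: samples are drawn from a uniformly ergodic Markov chain rather than i.i.d., and ERM simply selects the hypothesis with the smallest empirical risk on those dependent samples. The two-state chain, the finite hypothesis class, and the 0-1 loss below are hypothetical choices for illustration only.

```python
import random

def markov_chain_samples(n, p_stay=0.7, seed=0):
    """Draw n states from a two-state Markov chain with P(stay) = p_stay.
    For p_stay in (0, 1) this chain is uniformly ergodic, so successive
    samples are dependent but the chain mixes at a geometric rate."""
    rng = random.Random(seed)
    x, out = 0, []
    for _ in range(n):
        out.append(x)
        if rng.random() > p_stay:
            x = 1 - x  # switch state
    return out

def erm(samples, labels, hypotheses):
    """Return the hypothesis minimizing the empirical 0-1 risk."""
    def empirical_risk(h):
        return sum(h(x) != y for x, y in zip(samples, labels)) / len(samples)
    return min(hypotheses, key=empirical_risk)

# Noisy target: the label equals the state, flipped with probability 0.1.
label_rng = random.Random(1)
xs = markov_chain_samples(500)
ys = [x if label_rng.random() > 0.1 else 1 - x for x in xs]

# A toy finite hypothesis class: identity, flip, and the two constants.
hypotheses = [lambda x: x, lambda x: 1 - x, lambda x: 0, lambda x: 1]
best = erm(xs, ys, hypotheses)
```

Here the identity hypothesis attains empirical risk close to the 10% label noise, while the alternatives sit near 50% or above, so ERM recovers it despite the dependence between samples; the paper's results bound how fast such empirical risks converge to the true risks uniformly over the hypothesis class.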
Additional information
Supported by the National 973 Project (No. 2013CB329404), the Key Project of the NSF of China (No. 11131006), and the National Natural Science Foundation of China (Nos. 61075054, 61370000).
Cite this article
Zou, B., Xu, Z.B., Xu, J. Generalization bounds of ERM algorithm with Markov chain samples. Acta Math. Appl. Sin. Engl. Ser., 30: 223–238 (2014). https://doi.org/10.1007/s10255-011-0096-4
Keywords
- generalization bounds
- ERM algorithm
- relative uniform convergence
- uniformly ergodic Markov chain
- learning theory