Skip to main content
Log in

Generalization bounds of ERM algorithm with Markov chain samples

  • Published:
Acta Mathematicae Applicatae Sinica, English Series Aims and scope Submit manuscript

Abstract

One of the main goals of machine learning is to study the generalization performance of learning algorithms. The previous main results describing the generalization ability of learning algorithms are usually based on independent and identically distributed (i.i.d.) samples. However, independence is a very restrictive concept for both theory and real-world applications. In this paper we go far beyond this classical framework by establishing the bounds on the rate of relative uniform convergence for the Empirical Risk Minimization (ERM) algorithm with uniformly ergodic Markov chain samples. We not only obtain generalization bounds of ERM algorithm, but also show that the ERM algorithm with uniformly ergodic Markov chain samples is consistent. The established theory underlies application of ERM type of learning algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Azuma, K. Weighted sums of certain dependent random variables. Tohoku Math. J., 199: 357–367 (1967)

    Article  MathSciNet  Google Scholar 

  2. Bousquet, O. New approaches to statistical learning theory. Annals of the Institute of Statistical Machematics, 55: 371–389 (2003)

    MATH  MathSciNet  Google Scholar 

  3. Chen, D.R., Wu, Q., Ying, Y.M., Zhou, D.X. Support vector machine soft margin clossifiers: error analysis. Journal of Machine Learning Research, 5: 1143–1175 (2004)

    MATH  MathSciNet  Google Scholar 

  4. Cucker, F., Smale, S. On the mathematical foundations of learning. Bulletin of the American Mathematical Society, 39: 1–49 (2001)

    Article  MathSciNet  Google Scholar 

  5. Cucker, F., Smale, S. Best choices for regularization parameters in learning theory: on the bias-variance problem. Found. Comput. Math., 2: 413–428 (2002)

    Article  MATH  MathSciNet  Google Scholar 

  6. Cucker, F., Zhou, D.X. Learning Theory: An Approximation Theory Viewpoint. Cambridge University Press, Cambridge, 2007

    Book  Google Scholar 

  7. DeVore, R., Kerkyacharian, G., Picard, D., Temlyakov, V. Approximation methods for supervised learning. Foundations of Computational Mathematics, 6: 3–58 (2006)

    Article  MATH  MathSciNet  Google Scholar 

  8. Gamarnik, D. Extension of PAC framework to finite and countable markov chains. IEEE Trans. Inform. Theory, 49: 338–345 (2003)

    Article  MATH  MathSciNet  Google Scholar 

  9. Kontorovich, L., Ramanan, K. Concentration inequalities for dependent random variables via the martingale method. Ann. Probab., 36: 2126–2158 (2008)

    Article  MATH  MathSciNet  Google Scholar 

  10. Meyn, S.P., Tweedie, R.L. Markov chains and Stochastic Stability. Springer-Verlag, London, 1993

    Book  MATH  Google Scholar 

  11. Modha, S., Masry, E. Minimum complexity regression estimation with weakly dependent observations. IEEE Trans. Inform. Theory, 42: 2133–2145 (1996)

    Article  MATH  MathSciNet  Google Scholar 

  12. Nobel, A., Dembo, A. A Note on uniform laws of averages for dependent processes. Statist. Probab. Lett., 17: 169–172 (1993)

    Article  MATH  MathSciNet  Google Scholar 

  13. Roberts, G.O., Rosenthal, J.S. General state space Markov chains and MCMC algorithms. Probability Surveys, 1: 20–71 (2004)

    Article  MATH  MathSciNet  Google Scholar 

  14. Smale, S., Zhou, D.X. Estimating the approximation error in learning theory. Anal. Appl., 1: 17–41 (2003)

    Article  MATH  MathSciNet  Google Scholar 

  15. Smale, S., Zhou, D.X. Shannon sampling and function reconstruction from point values. Bull. Amer. Math. Soc., 419: 279–305 (2004)

    Article  MathSciNet  Google Scholar 

  16. Smale, S., Zhou, D.X. Online learning with Markov sampling. Anal. Appl., 7: 87–113 (2009)

    Article  MATH  MathSciNet  Google Scholar 

  17. Steinwart, I., Hush, D., Scovel, C. Learning from dependent observations. Multivariate Anal., 100: 175–194 (2009)

    Article  MATH  MathSciNet  Google Scholar 

  18. Sun, H.W., Wu, Q. Regularized least square regression with dependnet samples. Adv. Comput. Math., 32: 175–189 (2010)

    Article  MATH  MathSciNet  Google Scholar 

  19. Vapnik, V. Statistical Learning Theory. John Wiley, New York, 1998

    MATH  Google Scholar 

  20. Vidyasagar, M. Learning and generalization with applications to neural networks. Springer, London, 2nd edition, 2003

    Google Scholar 

  21. Xu, Y.L., Chen, D.R. Learning rates of regularized regression for exponentially strongly mixing sequence. J. Statist. Plann., 138: 2180–2189 (2008)

    Article  MATH  Google Scholar 

  22. Yu, B. Rates of convergence for empirical processes of stationary mixing sequences. Ann. Probab., 22: 94–114 (1994)

    Article  MATH  MathSciNet  Google Scholar 

  23. Zou, B., Li, L.Q. The performance bounds of learning machines based on exponentially strongly mixing sequence. Computer and Mathematics with Applications, 53: 1050–1058 (2007)

    Article  MATH  MathSciNet  Google Scholar 

  24. Zou, B., Zhang, H., Xu, Z.B. Learning from uniformly ergodic Markov chain samples. Journal of Complexity, 25: 188–200 (2009)

    Article  MATH  MathSciNet  Google Scholar 

  25. Zou, B., Li, L.Q., Xu, Z.B. The generalization performance of ERM algorithm with strongly mixing observations. Machine Learning, 75: 275–295 (2009)

    Article  Google Scholar 

  26. Zou, B., Xu, Z.B., Chang, X.Y. Generalization performance of ERM Algorithm with V-geometrically ergodic Markov Chain samples. Adv. Comput. Math., 36(1): 99–114 (2012)

    Article  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bin Zou.

Additional information

Supported by National 973 project (No. 2013CB329404), Key Project of NSF of China (No. 11131006), National Natural Science Foundation of China (No. 61075054, 61370000).

Rights and permissions

Reprints and permissions

About this article

Cite this article

Zou, B., Xu, Zb. & Xu, J. Generalization bounds of ERM algorithm with Markov chain samples. Acta Math. Appl. Sin. Engl. Ser. 30, 223–238 (2014). https://doi.org/10.1007/s10255-011-0096-4

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10255-011-0096-4

Keywords

2000 MR Subject Classification

Navigation