
Generalization analysis of Fredholm kernel regularized classifiers. (English) Zbl 1456.68148

Summary: Recently, a new framework, Fredholm learning, was proposed for semisupervised learning problems based on solving a regularized Fredholm integral equation. It offers a natural way to incorporate unlabeled data into learning algorithms to improve their prediction performance. Despite rapid progress on implementable algorithms with theoretical guarantees, the generalization ability of Fredholm kernel learning has not been studied. In this letter, we investigate the generalization performance of a family of classification algorithms, referred to as Fredholm kernel regularized classifiers. We prove that the corresponding learning rate can achieve \(\mathcal{O}(l^{-1})\) (where \(l\) is the number of labeled samples) in a limiting case. In addition, a representer theorem is provided for the proposed regularized scheme, which underlies its applications.
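The scheme described above can be sketched concretely. In the Fredholm kernel construction (following the approach of Que, Belkin, and Wang cited below), unlabeled points \(u_1, \dots, u_u\) induce a data-dependent kernel \(K_F(x, z) = \frac{1}{u^2} \sum_{i,j} k(x, u_i)\, K(u_i, u_j)\, k(u_j, z)\), and the regularized classifier is then a standard kernel least-squares fit in the resulting RKHS, whose solution has the finite expansion guaranteed by a representer theorem. The following is a minimal illustrative sketch, not the paper's implementation; the Gaussian choice for both kernels, the function names, and the hyperparameter values are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    # Pairwise Gaussian kernel matrix between rows of A and rows of B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fredholm_kernel(X1, X2, U, sigma_out=1.0, sigma_in=1.0):
    # Data-dependent Fredholm kernel built from unlabeled points U:
    #   K_F(x, z) = (1/u^2) * sum_{i,j} k(x, u_i) K(u_i, u_j) k(u_j, z).
    # Both inner and outer kernels are Gaussian here (an illustrative choice).
    u = len(U)
    Ko1 = gaussian_kernel(X1, U, sigma_out)   # k(x, u_i)
    Ki = gaussian_kernel(U, U, sigma_in)      # K(u_i, u_j)
    Ko2 = gaussian_kernel(U, X2, sigma_out)   # k(u_j, z)
    return Ko1 @ Ki @ Ko2 / u ** 2

def fit_fredholm_classifier(Xl, y, U, lam=1e-3, **kw):
    # Regularized least-squares classifier with the Fredholm kernel.
    # By the representer theorem, the minimizer expands over the l labeled
    # points, so it suffices to solve (K_F + lam * l * I) alpha = y.
    l = len(Xl)
    K = fredholm_kernel(Xl, Xl, U, **kw)
    return np.linalg.solve(K + lam * l * np.eye(l), y)

def predict(Xtest, Xl, U, alpha, **kw):
    # Classify by the sign of the expansion sum_i alpha_i K_F(x, x_i).
    return np.sign(fredholm_kernel(Xtest, Xl, U, **kw) @ alpha)
```

Note how the unlabeled sample enters only through the kernel: `fredholm_kernel` averages over the unlabeled points, so the hypothesis space itself adapts to the marginal distribution, while the labeled sample of size \(l\) is used only in the final \(l \times l\) linear solve.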

MSC:

68T05 Learning and adaptive systems in artificial intelligence
45B05 Fredholm integral equations
62H30 Classification and discrimination; cluster analysis (statistical aspects)
