
A penalized likelihood based pattern classification algorithm. (English) Zbl 1175.68351

Summary: Penalized likelihood is a general approach in which an objective function is defined as the log likelihood of the data minus a term penalizing non-smooth solutions. This objective function is then maximized, yielding a solution that trades off the faithfulness of the fit against its smoothness. Most work on this topic has focused on the regression problem, and there has been little work on classification. In this paper we propose a new classification method based on penalized likelihood (for the two-class case). By proposing a novel penalty term based on the \(K\)-nearest neighbors, simple analytical derivations lead to an algorithm that is proved to converge to the global optimum. Moreover, the algorithm is very simple to implement and typically converges in two or three iterations. We also introduce two variants of the method: one that distance-weights the \(K\)-nearest-neighbor contributions, and one that handles unbalanced class distributions. We performed extensive experiments comparing the proposed method to several well-known classification methods. These simulations reveal that the proposed method achieves one of the top ranks in classification performance, with fairly small computation time.
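The summary does not reproduce the objective function, but the setup it describes has the generic penalized-likelihood form \(\max_{p} \sum_i [y_i \log p_i + (1 - y_i) \log(1 - p_i)] - \lambda J(p)\), where \(p_i\) estimates the class-1 posterior at training point \(i\) and \(J\) penalizes non-smooth solutions. The sketch below is an illustrative reading of that setup, not the paper's actual algorithm: it assumes a quadratic penalty over the \(K\)-nearest neighbors, \(J(p) = \sum_i \sum_{j \in N_K(i)} (p_i - p_j)^2\), and all names (knn_penalized_posteriors, lam, and so on) are placeholders.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_penalized_posteriors(X, y, k=5, lam=1.0, n_iter=10, tol=1e-6):
    """Estimate p_i = P(y=1 | x_i) at each training point by maximizing, per point,
    y_i*log(p_i) + (1-y_i)*log(1-p_i) - lam * sum_{j in N_K(i)} (p_i - p_j)^2
    with the neighbour posteriors held fixed within a sweep.
    Illustrative sketch only; the paper's penalty and updates may differ."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    n = len(y)
    # Query k+1 neighbours because each point is its own nearest neighbour.
    _, nbrs = cKDTree(X).query(X, k=k + 1)
    nbrs = nbrs[:, 1:]
    p = np.full(n, 0.5)        # uninformative initial posterior estimates
    c = 2.0 / (lam * k)        # constant in the scalar stationarity condition
    for _ in range(n_iter):
        p_old = p.copy()
        m = p[nbrs].mean(axis=1)   # local average of neighbour posteriors
        # Each scalar subproblem is concave; for y_i = 1 its stationary point
        # solves 2*lam*k*p^2 - 2*lam*k*m*p - 1 = 0 (symmetric in q = 1-p for y_i = 0).
        p1 = 0.5 * (m + np.sqrt(m**2 + c))                          # update for y_i = 1
        p0 = 1.0 - 0.5 * ((1.0 - m) + np.sqrt((1.0 - m)**2 + c))   # update for y_i = 0
        # Jacobi-style sweep: all points are updated from the previous iterate.
        p = np.clip(np.where(y == 1, p1, p0), 0.0, 1.0)
        if np.max(np.abs(p - p_old)) < tol:
            break
    return p

# Toy usage: two Gaussian blobs; posteriors should separate the classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(3.0, 1.0, (50, 2))])
y = np.repeat([0, 1], 50)
p = knn_penalized_posteriors(X, y, k=5, lam=0.5)
print(p[:3].round(3), p[-3:].round(3))
```

With this quadratic penalty each one-dimensional subproblem is strictly concave with a closed-form maximizer, which is consistent with (though not identical to) the paper's reported behaviour of provable convergence to the global optimum in two or three iterations.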

MSC:

68T10 Pattern recognition, speech recognition

Software:

UCI-ml

References:

[1] A. Asuncion, D.J. Newman, UCI Machine Learning Repository, http://www.ics.uci.edu/~mlearn/MLRepository.html, 2007.
[2] Atiya, A. F., Estimating the posterior probabilities using the \(k\)-nearest neighbor rule, Neural Computation, 17, 731-740 (2005) · Zbl 1089.68118
[3] Bailey, T.; Jain, A., A note on distance-weighted \(k\)-nearest neighbor rules, IEEE Transactions on Systems, Man, and Cybernetics, 8, 311-313 (1978) · Zbl 0383.68067
[4] Bazaraa, M.; Shetty, C., Nonlinear Programming (1979), Wiley: Wiley New York · Zbl 0476.90035
[5] Berry, S. M.; Carroll, R. J.; Ruppert, D., Bayesian smoothing and regression splines for measurement error problems, Journal of the American Statistical Association, 97, 457, 160-169 (2002) · Zbl 1073.62524
[6] G. Cawley, N.L. Talbot, M. Girolami, Sparse multinomial logistic regression via Bayesian L1 regularisation, in: Proceedings of NIPS, 2007, pp. 209-216.
[7] Duda, R. O.; Hart, P. E.; Stork, D. G., Pattern Classification (2000), Wiley: Wiley New York
[8] Dudani, S., The distance-weighted \(k\)-nearest-neighbor rule, IEEE Transactions on Systems, Man, and Cybernetics, 6, 325-327 (1976)
[9] J. Goldberger, S. Roweis, G. Hinton, R. Salakhutdinov, Neighbourhood components analysis, in: Advances in Neural Information Processing Systems, vol. 17, 2004.
[10] Granger, C. W.J.; Jeon, Y., Thick modeling, Economic Modelling, 21, 2, 323-343 (2004)
[11] P. Green, Penalized likelihood, in: Encyclopedia of Statistical Sciences, Update vol. 3, 1999.
[12] Green, P. J.; Silverman, B. W., Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach (1994), Chapman and Hall: Chapman and Hall London · Zbl 0832.62032
[13] Gu, C., Cross-validating non-Gaussian data, Journal of Computational and Graphical Statistics, 1, 169-179 (1992)
[14] Gu, C.; Kim, Y.-J., Penalized likelihood regression: general formulation and efficient approximation, Canadian Journal of Statistics, 29 (2002) · Zbl 1018.62032
[15] Hastie, T.; Tibshirani, R., Generalized Additive Models (1990), Chapman and Hall: Chapman and Hall London · Zbl 0747.62061
[16] Holmes, C. C.; Adams, N. M., A probabilistic nearest neighbour method for statistical pattern recognition, Journal of the Royal Statistical Society Series B, 64, 295-306 (2002) · Zbl 1059.62065
[17] Jenssen, R.; Erdogmus, D.; Principe, J. C.; Eltoft, T., The Laplacian classifier, IEEE Transactions on Signal Processing, 55, 7, 3262-3271 (2007) · Zbl 1390.35060
[18] Loader, C., Local Regression and Likelihood (1999), Springer: Springer Berlin · Zbl 0929.62046
[19] F. Lu, G.C. Hill, G. Wahba, P. Desiati, Signal probability estimation with penalized likelihood method on weighted data, Department of Statistics, University of Wisconsin, Technical Report No. 1106, 2005. · Zbl 1096.62051
[20] O’Sullivan, F.; Yandell, B.; Raynor, W., Automatic smoothing of regression functions in generalized linear models, Journal of the American Statistical Association, 81, 96-103 (1986)
[21] P. Paalanen, J. Kämäräinen, H. Kälviäinen, http://www.it.lut.fi/project/gmmbayes/, 2007.
[22] Provost, F.; Fawcett, T., Robust classification of imprecise environments, Machine Learning, 42, 203-231 (2001) · Zbl 0969.68126
[23] Ramoser, H.; Müller-Gerking, J.; Pfurtscheller, G., Optimal spatial filtering of single trial EEG during imagined hand movement, IEEE Transactions on Rehabilitation Engineering, 8, 441-446 (2000)
[24] C.E. Rasmussen, http://www.GaussianProcess.org/gpml/code/index.html, 2007.
[25] Rasmussen, C. E.; Williams, C. K.I., Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning) (2005), MIT Press: MIT Press Cambridge, MA
[26] Schölkopf, B.; Smola, A. J., Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond (2001), MIT Press: MIT Press Cambridge, MA
[27] Silverman, B. W., Density Estimation for Statistics and Data Analysis (1986), Chapman and Hall: Chapman and Hall London · Zbl 0617.62042
[28] Swets, J., Measuring the accuracy of diagnostic systems, Science, 240, 1285-1293 (1988) · Zbl 1226.92048
[29] Wahba, G., Spline Models for Observational Data (1990), SIAM: SIAM Philadelphia, PA · Zbl 0813.62001
[30] Wahba, G., Soft and hard classification by reproducing kernel Hilbert space methods, Proceedings of the National Academy of Sciences, 99, 16524-16530 (2002) · Zbl 1106.62338
[31] G. Wahba, C. Gu, Y. Wang, R. Chappell, Soft classification, a.k.a. risk estimation, via penalized log likelihood and smoothing spline analysis of variance, Department of Statistics, University of Wisconsin, Technical Report No. 899, 1993. · Zbl 0861.68079
[32] Webb, G., MultiBoosting: a technique for combining boosting and wagging, Machine Learning, 40, 159-196 (2000)
[33] Zhang, C.-X.; Zhang, J.-S., RotBoost: a technique for combining Rotation Forest and AdaBoost, Pattern Recognition Letters, 29, 1524-1536 (2008)
[34] Zouhal, L.; Denoeux, T., An evidence-theoretic \(k\)-NN rule with parameter optimization, IEEE Transactions on Systems, Man, and Cybernetics, 28, 263-271 (1998)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases, these data have been complemented or enhanced by data from zbMATH Open. This list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or perfect matching.