
Recent advances in cluster analysis. (English) Zbl 1178.68514

Summary: This paper reviews the issues surrounding cluster analysis, one of the most important and primitive activities of human beings, and the advances made in recent years.
It investigates clustering algorithms rooted in machine learning, computer science, statistics, and computational intelligence.
It covers the basic issues of cluster analysis and discusses recent advances in clustering algorithms with respect to scalability, robustness, visualization, detection of irregular cluster shapes, and so on.
The result is a comprehensive and systematic survey of cluster analysis that emphasizes recent efforts to meet the challenges posed by the glut of complicated data from a wide variety of communities.

MSC:

68T10 Pattern recognition, speech recognition
68T05 Learning and adaptive systems in artificial intelligence
