Mining Bayesian networks from direct marketing databases with missing values. (English) Zbl 1179.68050

Gen, Mitsuo (ed.) et al., Intelligent and evolutionary systems. Berlin: Springer (ISBN 978-3-540-95977-9/hbk; 978-3-540-95978-6/ebook). Studies in Computational Intelligence 187, 13-35 (2009).
Summary: Discovering knowledge from huge databases with missing values is a challenging problem in data mining. In this paper, a novel hybrid algorithm for learning knowledge represented as Bayesian networks is discussed. The new algorithm combines an evolutionary algorithm with the expectation-maximization algorithm to overcome the problem of getting stuck in sub-optimal solutions, which occurs in most existing learning algorithms. The experimental results on databases generated from several benchmark network structures illustrate that our system outperforms some state-of-the-art algorithms. We also apply our system to a direct marketing problem and compare the performance of the discovered Bayesian networks with the response models obtained by other algorithms. In this comparison, the Bayesian networks learned by our system outperform the others.
For the entire collection see [Zbl 1161.68003].

MSC:

68P15 Database theory
90B60 Marketing, advertising

References:

[1] Jensen, F.V.: An Introduction to Bayesian Networks. University College London Press (1996)
[2] Andreassen, S., Woldbye, M., Falck, B., Andersen, S.: MUNIN: A Causal Probabilistic Network for Interpretation of Electromyographic Findings. In: Proceedings of the Tenth International Joint Conference on Artificial Intelligence, pp. 366-372 (1987)
[3] Cheeseman, P., Kelly, J., Self, M., Stutz, J., Taylor, W., Freeman, D.: AutoClass: a Bayesian classification system. In: Proceedings of the Fifth International Workshop on Machine Learning, pp. 54-64 (1988)
[4] Heckerman, D., Horvitz, E.: Inferring Informational Goals from Free-Text Queries: A Bayesian Approach. In: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pp. 230-237 (1998)
[5] Heckerman, D.; Wellman, M. P., Bayesian Networks, Communications of the ACM, 38, 3, 27-30 (1995) · doi:10.1145/203330.203336
[6] Cheng, J.; Greiner, R.; Kelly, J.; Bell, D.; Liu, W., Learning Bayesian Networks from Data: An Information-Theory Based Approach, Artificial Intelligence, 137, 43-90 (2002) · Zbl 0995.68114 · doi:10.1016/S0004-3702(02)00191-1
[7] Spirtes, P.; Glymour, C.; Scheines, R., Causation, Prediction, and Search (2000), Cambridge: MIT Press
[8] Cooper, G.; Herskovits, E., A Bayesian Method for the Induction of Probabilistic Networks from Data, Machine Learning, 9, 4, 309-347 (1992) · Zbl 0766.68109
[9] Heckerman, D.: A Tutorial on Learning Bayesian Networks. Tech. Rep. MSR-TR-95-06. Microsoft Research Adv. Technol. Div., Redmond, WA (1995) · Zbl 0921.62029
[10] Lam, W.; Bacchus, F., Learning Bayesian belief networks: an approach based on the MDL principle, Computational Intelligence, 10, 4, 269-293 (1994) · doi:10.1111/j.1467-8640.1994.tb00166.x
[11] Larrañaga, P.; Poza, M.; Yurramendi, Y.; Murga, R.; Kuijpers, C., Structure Learning of Bayesian Networks by Genetic Algorithms: A Performance Analysis of Control Parameters, IEEE Transactions on Pattern Analysis and Machine Intelligence, 18, 9, 912-926 (1996) · doi:10.1109/34.537345
[12] Larrañaga, P.; Kuijpers, C.; Murga, R.; Yurramendi, Y., Learning Bayesian Network Structures by Searching for the Best Ordering with Genetic Algorithms, IEEE Transactions on Systems, Man and Cybernetics, 26, 4, 487-493 (1996) · doi:10.1109/3468.508827
[13] Wong, M. L.; Lam, W.; Leung, K. S., Using Evolutionary Programming and Minimum Description Length principle for data mining of Bayesian networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, 21, 2, 174-178 (1999) · doi:10.1109/34.748825
[14] Wong, M. L.; Leung, K. S., An Efficient Data Mining Method for Learning Bayesian Networks Using an Evolutionary Algorithm-Based Hybrid Approach, IEEE Transactions on Evolutionary Computation, 8, 4, 378-404 (2004) · doi:10.1109/TEVC.2004.830334
[15] Schafer, J. L.; Graham, J. W., Missing data: Our View of the State of the Art, Psychological Methods, 7, 2, 147-177 (2002) · doi:10.1037/1082-989X.7.2.147
[16] Ramoni, M., Sebastiani, P.: Efficient Parameter Learning in Bayesian Networks from Incomplete Databases. Tech. Rep. KMI-TR-41 (1997)
[17] Ramoni, M., Sebastiani, P.: The Use of Exogenous Knowledge to Learn Bayesian Networks from Incomplete Databases. Tech. Rep. KMI-TR-44 (1997)
[18] Friedman, N.: Learning Belief Networks in the Presence of Missing Values and Hidden Variables. In: Proceedings of the Fourteenth International Conference on Machine Learning, pp. 125-133 (1997)
[19] Friedman, N.: The Bayesian Structural EM Algorithm. In: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence, pp. 80-89 (1998)
[20] Peña, J. M.; Lozano, J. A.; Larrañaga, P., An Improved Bayesian Structural EM Algorithm for Learning Bayesian Networks for Clustering, Pattern Recognition Letters, 21, 779-786 (2000) · doi:10.1016/S0167-8655(00)00038-6
[21] Peña, J. M.; Lozano, J. A.; Larrañaga, P., Learning Recursive Bayesian Multinets for Data Clustering by Means of Constructive Induction, Machine Learning, 47, 63-89 (2002) · Zbl 1012.68153 · doi:10.1023/A:1013683712412
[22] Myers, J., Laskey, K., DeJong, K.: Learning Bayesian Networks from Incomplete Data using Evolutionary Algorithms. In: Proceedings of the First Annual Conference on Genetic and Evolutionary Computation Conference, pp. 458-465 (1999)
[23] Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference (1988), San Mateo: Morgan Kaufmann
[24] Dempster, A. P.; Laird, N. M.; Rubin, D. B., Maximum Likelihood from Incomplete Data via the EM Algorithm, Journal of the Royal Statistical Society (B), 39, 1, 1-38 (1977) · Zbl 0364.62022
[25] Lauritzen, S., The EM Algorithm for Graphical Association Models with Missing Data, Computational Statistics and Data Analysis, 19, 191-201 (1995) · Zbl 0875.62237 · doi:10.1016/0167-9473(93)E0056-A
[26] Huang, C.; Darwiche, A., Inference in Belief Networks: a Procedural Guide, International Journal of Approximate Reasoning, 15, 3, 225-263 (1996) · Zbl 0941.68767 · doi:10.1016/S0888-613X(96)00069-2
[27] LibB, http://compbio.cs.huji.ac.il/LibB/
[28] Bayesware Discoverer, http://www.bayesware.com/frontpage.html
[29] Norsys Bayes Net Library, http://www.norsys.com/net_library.htm
[30] Chickering, D. M., Learning Equivalence Classes of Bayesian Network Structures, Journal of Machine Learning Research, 2, 445-498 (2002) · Zbl 1007.68179 · doi:10.1162/153244302760200696
[31] Beaumont, G. P.; Knowles, J. D., Statistical Tests: An Introduction with MINITAB Commentary (1996), Englewood Cliffs: Prentice-Hall · Zbl 0837.62001
[32] Zahavi, J.; Levin, N., Issues and Problems in Applying Neural Computing to Target Marketing, Journal of Direct Marketing, 11, 4, 63-75 (1997) · doi:10.1002/(SICI)1522-7138(199723)11:4<63::AID-DIR9>3.0.CO;2-U
[33] Bhattacharyya, S.: Direct Marketing Response Models using Genetic Algorithms. In: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 144-148 (1998)
[34] Cabena, P.; Hadjinian, P.; Stadler, R.; Verhees, J.; Zanasi, A., Discovering Data Mining: From Concept to Implementation (1997), Englewood Cliffs: Prentice-Hall
[35] Petrison, L. A.; Blattberg, R. C.; Wang, P., Database Marketing: Past, Present, and Future, Journal of Direct Marketing, 11, 4, 109-125 (1997) · doi:10.1002/(SICI)1522-7138(199723)11:4<109::AID-DIR12>3.0.CO;2-G
[36] Bhattacharyya, S.: Evolutionary Algorithms in Data Mining: Multi-Objective Performance Modeling for Direct Marketing. In: Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining, pp. 465-473 (2000)
[37] Zahavi, J.; Levin, N., Applying Neural Computing to Target Marketing, Journal of Direct Marketing, 11, 4, 76-93 (1997) · doi:10.1002/(SICI)1522-7138(199723)11:4<76::AID-DIR10>3.0.CO;2-D
[38] Ling, C.X., Li, C.H.: Data Mining for Direct Marketing: Problems and Solutions. In: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 73-79 (1998)
[39] Friedman, N.; Geiger, D.; Goldszmidt, M., Bayesian Network Classifiers, Machine Learning, 29, 131-163 (1997) · Zbl 0892.68077 · doi:10.1023/A:1007465528199
[40] Rud, O. P., Data Mining Cookbook: Modeling Data for Marketing, Risk and Customer Relationship Management (2001), New York: Wiley