×

Impact of minimum-cut density-balanced partitioning solutions in distributed webpage ranking. (English) Zbl 1442.90171

Summary: This paper presents a new mathematical programming model and a solution approach for a special class of graph partitioning problem. The problem studied here is in the context of distributed web search, in which a very large world-wide-web graph is partitioned to improve the efficiency of webpage ranking (known as PageRank). Although graph partitioning problems have been widely studied and there have been several computational algorithms and mathematical programming models in the literature, the graph partitioning problem for PageRank imposes unique constraints on the density balance. This problem is called the min-cut density-balanced partitioning problem. In this paper, we propose a new mathematical programming model and a solution approach to efficiently solve this min-cut density-balanced partitioning problem. As the objective on the minimum cut and the constraint on the density balance are not the direct performance measure of PageRank, we also investigate the performance of the solutions obtained from a MIP solver and our approach on the ranking’s accuracy and the local ranking’s computation times. The experiment results show both solutions are comparable in terms of the ranking’s accuracy and the local ranking’s computation times whereas it is much faster to obtain the partitioning solutions using our approach.

MSC:

90C27 Combinatorial optimization
90C11 Mixed integer programming

Software:

SNAP
Full Text: DOI

References:

[1] Andreev, K.; Racke, H., Balanced graph partitioning, Theory Comput. Syst., 39, 929-939 (2006) · Zbl 1113.68069 · doi:10.1007/s00224-006-1350-7
[2] Bourse, F., Lelarge, M., Vojnovic, M.: Balanced graph edge partition. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’14, pp. 1456-1465. ACM, New York (2014). 10.1145/2623330.2623660
[3] Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the Seventh International Conference on World Wide Web 7, WWW7, pp. 107-117. Elsevier Science Publishers B. V., Amsterdam (1998). http://dl.acm.org/citation.cfm?id=297805.297827
[4] Datta, D.; Figueira, JR, Graph partitioning by multi-objective real-valued metaheuristics: a comparative study, Appl. Soft Comput., 11, 5, 3976-3987 (2011) · doi:10.1016/j.asoc.2011.01.044
[5] Even, G., Naor, J.S., Rao, S., Schieber, B.: Fast approximate graph partitioning algorithms. In: Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 639-648. Society for Industrial and Applied Mathematics, Philadelphia (1997) · Zbl 1321.05259
[6] Fagin, R., Kumar, R., Sivakumar, D.: Comparing top \(k\) lists. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pp. 28-36 (2003) · Zbl 1094.68562
[7] Feige, U.; Krauthgamer, R., A polylogarithmic approximation of the minimum bisection, SIAM J. Comput., 31, 4, 1090-1118 (2002) · Zbl 1015.68240 · doi:10.1137/S0097539701387660
[8] Figueiredo, R.; Moura, G., Mixed integer programming formulations for clustering problems related to structural balance, Social Netw., 35, 4, 639-651 (2013) · doi:10.1016/j.socnet.2013.09.002
[9] Garey, MR; Johnson, DS, Computers and Intractability: A Guide to the Theory of NP-Completeness (1990), New York: W. H. Freeman & Co., New York
[10] Kleinberg, JM, Authoritative sources in a hyperlinked environment, J. ACM, 46, 5, 604-632 (1999) · Zbl 1065.68660 · doi:10.1145/324133.324140
[11] Leskovec, J., Krevl, A.: SNAP datasets: Stanford large network dataset collection. http://snap.stanford.edu/data (2014)
[12] Parreira, J.X., Weikum, G.: JXP global authority scores in a P2P network. In: Proceedings of the Eight International Workshop on the Web and Databases (WebDB 2005), pp. 31-36. Baltimore, Maryland (2005)
[13] Salam, B.; Salman, FS; Sayn, S.; Terkay, M., A mixed-integer programming approach to the clustering problem with an application in customer segmentation, Eur. J. Oper. Res., 173, 3, 866-879 (2006) · Zbl 1131.90434 · doi:10.1016/j.ejor.2005.04.048
[14] Sangamuang, S., Boonma, P., Natwichai, J.: An efficient algorithm for density-balanced partitioning in distributed pagerank. In: 2014 9th International Conference on Digital Information Management, ICDIM 2014, pp. 118-123 (2014)
[15] Sangamuang, S.; Boonma, P.; Natwichai, J., An Algorithm for Min-Cut Density-Balanced Partitioning in P2P Web Ranking, 257-266 (2015), Cham: Springer, Cham
[16] Schloegel, K.; Karypis, G.; Kumar, V., A New Algorithm for Multi-objective Graph Partitioning, 322-331 (1999), Berlin: Springer, Berlin
[17] Shi, S., Yu, J., Yang, G., Wang, D.: Distributed page ranking in structured p2p networks. In: Proceedings of the 2003 International Conference on Parallel Processing (2003)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.