×

Adaptive searching in succinctly encoded binary relations and tree-structured documents. (English) Zbl 1144.68014

Summary: The methods most heavily used by search engines to answer conjunctive queries on binary relations (such as one associating keywords with web-pages) are based on computing the intersection of postings lists stored as sorted arrays and using variants of binary search. We show that a succinct representation of the binary relation permits much better results, while using less space than traditional methods. We apply our results not only to conjunctive queries on binary relations, but also to queries on semi-structured documents such as XML documents or file-system indexes, using a variant of an adaptive algorithm used to solve conjunctive queries on binary relations.

MSC:

68P05 Data structures
68P10 Searching and sorting
68P20 Information storage and retrieval of data
Full Text: DOI

References:

[1] Baeza-Yates, R. A., A fast set intersection algorithm for sorted sequences, (Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching. Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching, Lecture Notes in Computer Science, vol. 3109 (2004), Springer), 400-408 · Zbl 1103.68485
[2] J. Barbay, C. Kenyon, Alternation and redundancy analysis of the intersection problem, ACM Transactions on Algorithms 4 (1) (2008) 18 pp. (in press); J. Barbay, C. Kenyon, Alternation and redundancy analysis of the intersection problem, ACM Transactions on Algorithms 4 (1) (2008) 18 pp. (in press) · Zbl 1445.68338
[3] Barbay, J.; López-Ortiz, A.; Lu, T., Faster adaptive set intersections for text searching, (5th International Workshop on Experimental Algorithms. 5th International Workshop on Experimental Algorithms, Lecture Notes in Computer Science, vol. 4007 (2006), Springer), 146-157
[4] Benoit, D.; Demaine, E. D.; Munro, J. I.; Raman, R.; Raman, V.; Rao, S. S., Representing trees of higher degree, Algorithmica, 275-292 (2005) · Zbl 1086.68034
[5] D.R. Clark, J.I. Munro, Efficient suffix trees on secondary storage, in: Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, 1996, pp. 383-391; D.R. Clark, J.I. Munro, Efficient suffix trees on secondary storage, in: Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, 1996, pp. 383-391 · Zbl 0847.68030
[6] E.D. Demaine, A. López-Ortiz, J.I. Munro, Adaptive set intersections, unions, and differences, in: Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, 2000, pp. 743-752; E.D. Demaine, A. López-Ortiz, J.I. Munro, Adaptive set intersections, unions, and differences, in: Proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, 2000, pp. 743-752 · Zbl 0957.68124
[7] Demaine, E. D.; López-Ortiz, A.; Munro, J. I., Experiments on adaptive set intersections for text retrieval systems, (Proceedings of the 3rd Workshop on Algorithm Engineering and Experiments. Proceedings of the 3rd Workshop on Algorithm Engineering and Experiments, Lecture Notes in Computer Science, vol. 2153 (2001), Springer), 91-104 · Zbl 1010.68741
[8] Estivill-Castro, V.; Wood, D., A survey of adaptive sorting algorithms, ACM Computing Surveys, 24, 4, 441-476 (1992)
[9] P. Ferragina, F. Luccio, G. Manzini, S. Muthukrishnan, Structuring labeled trees for optimal succinctness, and beyond, in: Proceedings of the 46th IEEE Symposium on Foundations of Computer Science, 2005, pp. 184-196; P. Ferragina, F. Luccio, G. Manzini, S. Muthukrishnan, Structuring labeled trees for optimal succinctness, and beyond, in: Proceedings of the 46th IEEE Symposium on Foundations of Computer Science, 2005, pp. 184-196
[10] R.F. Geary, R. Raman, V. Raman, Succinct ordinal trees with level-ancestor queries, in: Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms, 2004, pp. 1-10; R.F. Geary, R. Raman, V. Raman, Succinct ordinal trees with level-ancestor queries, in: Proceedings of the 15th Annual ACM-SIAM Symposium on Discrete Algorithms, 2004, pp. 1-10 · Zbl 1317.68043
[11] A. Golynski, J.I. Munro, S.S. Rao, Rank/select operations on large alphabets: A tool for text indexing, in: Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, 2006, pp. 368-373; A. Golynski, J.I. Munro, S.S. Rao, Rank/select operations on large alphabets: A tool for text indexing, in: Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms, 2006, pp. 368-373 · Zbl 1192.68800
[12] R. Grossi, A. Gupta, J.S. Vitter, High-order entropy-compressed text indexes, in: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, 2003, pp. 841-850; R. Grossi, A. Gupta, J.S. Vitter, High-order entropy-compressed text indexes, in: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms, 2003, pp. 841-850 · Zbl 1092.68584
[13] G. Jacobson, Space-efficient static trees and graphs, in: Proceedings of the 30th Annual Symposium on Foundations of Computer Science, 1989, pp. 549-554; G. Jacobson, Space-efficient static trees and graphs, in: Proceedings of the 30th Annual Symposium on Foundations of Computer Science, 1989, pp. 549-554
[14] W.-K.S. Jesper Jansson, Kunihiko Sadakane, Ultra-succinct representation of ordered trees, in: Proceedings of the 18th annual ACM-SIAM Symposium on Discrete Algorithms, 2007; W.-K.S. Jesper Jansson, Kunihiko Sadakane, Ultra-succinct representation of ordered trees, in: Proceedings of the 18th annual ACM-SIAM Symposium on Discrete Algorithms, 2007 · Zbl 1302.68100
[15] Kirkpatrick, D. G.; Seidel, R., The ultimate planar convex hull algorithm?, SIAM Journal on Computing, 15, 1, 287-299 (1986) · Zbl 0589.68035
[16] Munro, J. I.; Raman, V., Succinct representation of balanced parentheses and static trees, SIAM Journal on Computing, 31, 3, 762-776 (2001) · Zbl 1017.68037
[17] Munro, J. I.; Rao, S. S., Succinct representations of functions, (31st International Colloquium on Automata, Languages and Programming. 31st International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science (2004), Springer), 1006-1015 · Zbl 1099.68603
[18] Neumann, J. V.; Morgenstern, O., Theory of Games and Economic Behavior (1944), Princeton University Press · Zbl 0063.05930
[19] R. Raman, V. Raman, S.S. Rao, Succinct indexable dictionaries with applications to encoding k-ary trees and multisets, in: Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete algorithms, 2002, pp. 233-242; R. Raman, V. Raman, S.S. Rao, Succinct indexable dictionaries with applications to encoding k-ary trees and multisets, in: Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete algorithms, 2002, pp. 233-242 · Zbl 1093.68582
[20] Sion, M., On general minimax theorems, Pacific Journal of Mathematics, 8, 171-176 (1958) · Zbl 0081.11502
[21] Willard, D. E., Log-logarithmic worst-case range queries are possible in space \(\Theta(N)\), Information Processing Letters, 17, 2, 81-84 (1983) · Zbl 0509.68106
[22] A.C. Yao, Probabilistic computations: Toward a unified measure of complexity, in: Proceedings of the 18th IEEE Symposium on Foundations of Computer Science, 1977, pp. 222-227; A.C. Yao, Probabilistic computations: Toward a unified measure of complexity, in: Proceedings of the 18th IEEE Symposium on Foundations of Computer Science, 1977, pp. 222-227
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.