Issue No.05 - May (2013 vol.25)

pp: 1015-1027

Shen Ge , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China

Leong Hou U , Dept. of Comput. & Inf. Sci., Univ. of Macau, Macau, China

N. Mamoulis , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China

D. W. Cheung , Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong, China

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2012.34

ABSTRACT

Given a set of objects P and a set of ranking functions F over P, an interesting problem is to compute the top-ranked objects for all functions. Evaluation of multiple top-k queries finds application in systems with a heavy workload of ranking queries (e.g., online search engines and product recommendation systems). The simple solution of evaluating the top-k queries one by one does not scale well; instead, the system can exploit the fact that similar queries share common results to accelerate search. To our knowledge, this paper is the first thorough study of this problem. We propose methods that compute all top-k queries in batch. Our first solution applies the block indexed nested loops paradigm, while our second technique is a view-based algorithm. We propose appropriate optimization techniques for the two approaches and demonstrate experimentally that the second approach is consistently the best. Our approach facilitates the evaluation of other complex queries that depend on the computation of multiple top-k queries, such as reverse top-k and top-m influential queries. We show that our batch processing technique for these complex queries outperforms the state of the art by orders of magnitude.
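To make the problem statement concrete, the following is a minimal sketch of the "simple solution" the abstract mentions: evaluating each top-k query independently. It is not the paper's block indexed nested loops or view-based algorithm. It assumes, for illustration only, that each ranking function is a linear weight vector and that higher scores rank first; the names `score`, `all_top_k`, and `reverse_top_k` are hypothetical.

```python
import heapq
from typing import List, Sequence


def score(weights: Sequence[float], obj: Sequence[float]) -> float:
    """Linear ranking function (an assumption): weighted sum of attributes."""
    return sum(w * x for w, x in zip(weights, obj))


def all_top_k(objects: List[Sequence[float]],
              functions: List[Sequence[float]],
              k: int) -> List[List[int]]:
    """Naive baseline: answer every top-k query one by one.

    Returns, for each function in F, the indices of its k best objects in P.
    Cost is O(|F| * |P| log k), which is what batch methods try to avoid.
    """
    results = []
    for weights in functions:
        # Ties are broken in favour of the smaller object index.
        top = heapq.nlargest(
            k,
            range(len(objects)),
            key=lambda i: (score(weights, objects[i]), -i),
        )
        results.append(top)
    return results


def reverse_top_k(all_results: List[List[int]], obj_id: int) -> List[int]:
    """Functions whose top-k result contains obj_id, read off the all top-k output."""
    return [f for f, top in enumerate(all_results) if obj_id in top]
```

Once all top-k results are available, a reverse top-k query for a given object reduces to a scan over those results, which illustrates why the abstract treats all top-k computation as a building block for the more complex queries.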

INDEX TERMS

Casting, Linear programming, Vectors, Batch production systems, Partitioning algorithms, Indexes, Spatial databases, View-based index, All top-$k$ queries

CITATION

Shen Ge, Leong Hou U, N. Mamoulis, D. W. Cheung, "Efficient All Top-k Computation - A Unified Solution for All Top-k, Reverse Top-k and Top-m Influential Queries",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 25, no. 5, pp. 1015-1027, May 2013, doi:10.1109/TKDE.2012.34