Distributed Skyline Retrieval with Low Bandwidth Consumption
March 2009 (vol. 21 no. 3)
pp. 384-400
Lin Zhu, Fudan University, Shanghai
Yufei Tao, Chinese University of Hong Kong, Hong Kong
Shuigeng Zhou, Fudan University, Shanghai
We consider skyline computation when the underlying dataset is horizontally partitioned onto geographically distant servers that are connected to the Internet. The existing solutions are not suitable for our problem, because they have at least one of the following drawbacks: (i) applicable only to distributed systems adopting vertical partitioning or restricted horizontal partitioning, (ii) effective only when each server has limited computing and communication abilities, and (iii) optimized only for skyline search in subspaces but inefficient in the full space. This paper proposes an algorithm, called feedback-based distributed skyline (FDS), to support arbitrary horizontal partitioning. FDS aims at minimizing the network bandwidth, measured in the number of tuples transmitted over the network. The core of FDS is a novel feedback-driven mechanism, where the coordinator iteratively transmits certain feedback to each participant. Participants can leverage such information to prune a large amount of local data, which otherwise would need to be sent to the coordinator. Extensive experimentation confirms that FDS significantly outperforms alternative approaches in both effectiveness and progressiveness.
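
The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the FDS algorithm itself: the abstract does not say how the coordinator chooses its feedback, so the feedback here is simply a list of points supplied by the caller. The function names, the sample data, and the convention that smaller values are better on every dimension are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q everywhere and strictly better somewhere.

    Assumption: smaller values are preferred on every dimension.
    """
    return (all(a <= b for a, b in zip(p, q))
            and any(a < b for a, b in zip(p, q)))


def skyline(points: Sequence[Point]) -> List[Point]:
    """Quadratic-time skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def prune_with_feedback(local: Sequence[Point],
                        feedback: Sequence[Point]) -> List[Point]:
    """Participant side: drop local points dominated by any feedback point."""
    return [p for p in local if not any(dominates(f, p) for f in feedback)]


def tuples_sent(partitions: Sequence[Sequence[Point]],
                feedback: Sequence[Point] = ()) -> int:
    """Count tuples the participants would ship to the coordinator.

    Each participant sends its local skyline, minus whatever the
    (hypothetical) feedback lets it discard.
    """
    return sum(len(prune_with_feedback(skyline(part), feedback))
               for part in partitions)


# Two hypothetical servers, each holding a horizontal partition.
partitions = [
    [(1, 9), (2, 8), (5, 5)],
    [(3, 7), (6, 6), (9, 1)],
]

baseline = tuples_sent(partitions)                      # no feedback
with_feedback = tuples_sent(partitions, [(5, 5)])       # one feedback point
global_skyline = skyline([p for part in partitions for p in part])
```

In this toy data, the point (5, 5) held by the first server dominates (6, 6) on the second server, so broadcasting it as feedback saves one transmitted tuple. The bandwidth measure used is the one named in the abstract: the number of tuples sent over the network.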


Index Terms:
Distributed databases, Spatial databases
Citation:
Lin Zhu, Yufei Tao, Shuigeng Zhou, "Distributed Skyline Retrieval with Low Bandwidth Consumption," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 3, pp. 384-400, March 2009, doi:10.1109/TKDE.2008.142