Issue No.06 - June (2008 vol.20)
pp: 809-824
ABSTRACT
Query processing over uncertain data has recently gained importance due to its wide usage in many real-world applications. In the context of uncertain databases, previous work has studied many query types, such as the nearest neighbor query, range query, top-$k$ query, skyline query, and similarity join. In this paper, we focus on another important query in uncertain databases, namely the probabilistic group nearest neighbor (PGNN) query, which also has many applications. Specifically, given a set, Q, of query points, a PGNN query retrieves data objects that minimize the aggregate distance (e.g., sum, min, or max) to the query set Q. Due to the inherent uncertainty of data objects, previous techniques for answering group nearest neighbor (GNN) queries cannot be directly applied to the PGNN problem. Motivated by this, we propose effective pruning methods, namely spatial pruning and probabilistic pruning, to reduce the PGNN search space; these methods can be seamlessly integrated into our PGNN query procedure. Extensive experiments demonstrate the efficiency and effectiveness of our proposed approach, in terms of wall clock time and speed-up ratio against linear scan.
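To make the notion of aggregate distance concrete, the following minimal sketch computes it for precise (certain) points and answers a plain GNN query by linear scan. It is an illustration only, not the paper's PGNN procedure: it assumes Euclidean distance, ignores uncertainty and pruning entirely, and the names `aggregate_distance` and `group_nearest_neighbor` are hypothetical.

```python
import math

# Aggregate functions named in the abstract: sum, min, and max.
AGGREGATES = {"sum": sum, "min": min, "max": max}


def aggregate_distance(point, query_points, agg="sum"):
    """Aggregate Euclidean distance from one point to every query point.

    Euclidean distance is an assumption made for this sketch.
    """
    return AGGREGATES[agg](math.dist(point, q) for q in query_points)


def group_nearest_neighbor(points, query_points, agg="sum"):
    """Linear-scan GNN over precise points (no uncertainty, no pruning)."""
    return min(points, key=lambda p: aggregate_distance(p, query_points, agg))


# Hypothetical query set Q and data points, chosen only for illustration.
Q = [(0.0, 0.0), (4.0, 0.0)]
data = [(2.0, 0.0), (0.0, 1.0), (5.0, 5.0)]

for agg in ("sum", "min", "max"):
    print(agg, group_nearest_neighbor(data, Q, agg))
```

With uncertain objects, each object's position is a distribution rather than a single point, so its aggregate distance becomes a random variable; this is why the abstract states that certain-data GNN techniques do not carry over directly.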
INDEX TERMS
Query processing, Search process
CITATION
Xiang Lian, Lei Chen, "Probabilistic Group Nearest Neighbor Queries in Uncertain Databases," IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. 6, pp. 809-824, June 2008, doi:10.1109/TKDE.2008.41