Issue No.03 - March (2010 vol.22)
Xiang Lian, Hong Kong University of Science and Technology, Hong Kong
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2009.112
Recently, many new applications, such as sensor data monitoring and mobile device tracking, have raised the issue of uncertain data management. In contrast to "certain" data, the data in an uncertain database are not exact points but instead often reside within a region. In this paper, we study ranked queries over uncertain data. Ranked queries have been studied extensively in the traditional database literature because of their popularity in many applications, such as decision making, recommendation, and data mining tasks. Many proposals have been made to improve the efficiency of answering ranked queries. However, the existing approaches all assume that the underlying data are exact (or certain). Because of the intrinsic differences between uncertain and certain data, these methods, which are designed only for ranked queries in certain databases, cannot be applied directly to the uncertain case. Motivated by this, we propose novel solutions to speed up the probabilistic ranked query (PRank) with monotonic preference functions over an uncertain database. Specifically, we introduce two effective pruning methods, spatial pruning and probabilistic pruning, to reduce the PRank search space. A special case of PRank with linear preference functions is also studied. We then integrate these pruning heuristics into the PRank query procedure. Furthermore, we formulate and tackle PRank query processing over the join of two distinct uncertain databases. Extensive experiments demonstrate the efficiency and effectiveness of our proposed approaches in answering PRank queries, in terms of both wall clock time and the number of candidates to be refined.
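The idea of pruning by region can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes each uncertain object is an axis-aligned box given by a lower and an upper corner, and that the preference function is monotonically non-decreasing in every dimension, so the score of any point in the box lies between the scores of the two corners. The names `score_bounds` and `spatial_prune`, the box representation, and the sample data are all hypothetical. An object whose best possible score is below the k-th largest worst-case score can never rank in the top k, so it can be discarded before any probability is computed.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: an uncertain object is an axis-aligned box,
# stored as (lower corner, upper corner).
Box = Tuple[Sequence[float], Sequence[float]]
Preference = Callable[[Sequence[float]], float]


def score_bounds(box: Box, f: Preference) -> Tuple[float, float]:
    """Return (min score, max score) over the box.

    Valid only if f is monotonically non-decreasing in every dimension,
    in which case the extremes are attained at the two corners.
    """
    lower, upper = box
    return f(lower), f(upper)


def spatial_prune(boxes: List[Box], f: Preference, k: int) -> List[int]:
    """Return indices of objects that may still rank in the top k."""
    bounds = [score_bounds(b, f) for b in boxes]
    if len(bounds) <= k:
        return list(range(len(boxes)))
    # k-th largest worst-case score: at least k objects are guaranteed
    # to score this high no matter where they lie in their regions.
    kth_lower = sorted((lo for lo, _ in bounds), reverse=True)[k - 1]
    # Keep an object only if its best case can reach that threshold.
    return [i for i, (_, hi) in enumerate(bounds) if hi >= kth_lower]


# Linear preference function with non-negative weights (hence monotonic).
f = lambda p: 0.5 * p[0] + 0.5 * p[1]
boxes = [
    ([8, 8], [9, 9]),  # score range [8.0, 9.0]
    ([6, 7], [7, 8]),  # score range [6.5, 7.5]
    ([1, 1], [2, 2]),  # score range [1.0, 2.0]
    ([5, 5], [8, 8]),  # score range [5.0, 8.0]
]
candidates = spatial_prune(boxes, f, k=2)
print(candidates)  # [0, 1, 3]
```

In this example the third object is pruned because its best score, 2.0, is below 6.5, the second-largest worst-case score. The surviving candidates would still need a refinement step that evaluates their actual ranking probabilities, which this sketch does not attempt.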
Probabilistic ranked query, probabilistic ranked query on join, PRank, J-PRank, uncertain database.
Xiang Lian, "Ranked Query Processing in Uncertain Databases", IEEE Transactions on Knowledge & Data Engineering, vol.22, no. 3, pp. 420-436, March 2010, doi:10.1109/TKDE.2009.112