Issue No. 08 - August (2010 vol. 22)

ISSN: 1041-4347

pp: 1158-1175

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TKDE.2010.75

Hanan Samet, University of Maryland, College Park, MD

Jagan Sankaranarayanan, University of Maryland, College Park, MD

ABSTRACT

The popularity of location-based services and the need to do real-time processing on them have led to an interest in performing queries on transportation networks, such as finding shortest paths and finding nearest neighbors. The challenge here is that the efficient execution of spatial operations usually involves the computation of distance along a spatial network instead of "as the crow flies," and computing network distance is not simple. Techniques are described that enable the determination of the network distance between any pair of points (i.e., vertices) with as little as $O(n)$ space rather than having to store the $n^2$ distances between all pairs. The savings come from spending slightly more time per query, such as $O(\log n)$ instead of $O(1)$, and from accepting an error $\varepsilon$ in the accuracy of the distance that is provided. The strategy that is adopted reduces the space requirements and is based on the ability to identify groups of source and destination vertices for which the distance is approximately the same within some $\varepsilon$. The reductions are achieved by introducing a construct termed a distance oracle that yields an estimate of the network distance (termed the $\varepsilon$-approximate distance) between any two vertices in the spatial network. The distance oracle is obtained by showing how to adapt the well-separated pair technique from computational geometry to spatial networks. Initially, an $\varepsilon$-approximate distance oracle of size $O\left(\frac{n}{\varepsilon^2}\right)$ is used that is capable of retrieving the approximate network distance in $O(\log n)$ time using a B-tree. The retrieval time can be theoretically reduced further to $O(1)$ by proposing another $\varepsilon$-approximate distance oracle of size $O\left(\frac{n \log n}{\varepsilon^2}\right)$ that uses a hash table. Experimental results indicate that the proposed technique is scalable and can be applied to sufficiently large road networks.
For example, a 10-percent-approximate oracle ($\varepsilon = 0.1$) on a large network yielded an average error of 0.9 percent, with 90 percent of the answers having an error of 2 percent or less, and an average retrieval time of 68 microseconds. The fact that the network distance can be approximated by one value is used to show how a number of spatial queries can be formulated using appropriate SQL constructs and a few built-in primitives. The result is that these operations can be executed on almost any modern database with no modifications, while taking advantage of the existing query optimizers and query processing strategies.
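The idea of storing one distance per group of well-separated source and destination vertices can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not taken from the abstract: the "network" is a path of unit-length edges (so the exact network distance between vertices `u` and `v` is `|u - v|`), the groups are aligned blocks of consecutive vertices, block pairs are keyed by a Morton (Z-order) code, and a sorted list searched with `bisect` stands in for the B-tree. The names `interleave`, `build_oracle`, and `lookup` are hypothetical.

```python
import bisect


def interleave(u, v, bits):
    """Morton (Z-order) code of the vertex pair (u, v), each `bits` wide."""
    code = 0
    for i in range(bits):
        code |= ((u >> i) & 1) << (2 * i + 1)
        code |= ((v >> i) & 1) << (2 * i)
    return code


def build_oracle(bits, eps):
    """Build a toy oracle for a path with 2**bits vertices.

    Returns two parallel lists: sorted Morton start codes of the stored
    block pairs, and the single representative distance kept for each.
    """
    entries = []

    def visit(a_lo, b_lo, size):
        # Blocks A = [a_lo, a_lo + size - 1] and B = [b_lo, b_lo + size - 1].
        diam = size - 1  # largest distance inside one block
        gap = 0 if a_lo == b_lo else abs(a_lo - b_lo) - diam
        # Well-separated test (toy version): the combined block diameters
        # must be at most eps times the smallest distance between blocks.
        if size == 1 or (gap > 0 and 2 * diam <= eps * gap):
            representative = abs(a_lo - b_lo)  # one value for the whole pair
            entries.append((interleave(a_lo, b_lo, bits), representative))
            return
        half = size // 2
        for da in (0, half):
            for db in (0, half):
                visit(a_lo + da, b_lo + db, half)

    visit(0, 0, 1 << bits)
    entries.sort()
    keys = [code for code, _ in entries]
    dists = [dist for _, dist in entries]
    return keys, dists


def lookup(keys, dists, u, v, bits):
    """Binary-search the stored block pair containing (u, v)."""
    code = interleave(u, v, bits)
    index = bisect.bisect_right(keys, code) - 1
    return dists[index]


BITS, EPS = 6, 0.25  # 64 vertices, 25 percent error bound
keys, dists = build_oracle(BITS, EPS)
print("stored entries:", len(keys), "of", (1 << BITS) ** 2, "vertex pairs")
print("approximate distance 3 -> 40:", lookup(keys, dists, 3, 40, BITS))
```

Because the stored block pairs tile the whole space of vertex pairs and each pair covers a contiguous range of Morton codes, one binary search finds the entry for any query. Replacing the sorted list with a dictionary would mirror the hash-table variant only loosely, since a query would then need to know the block size in advance.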

INDEX TERMS

Road networks, distance oracle, query processing.

CITATION

Hanan Samet, Jagan Sankaranarayanan, "Query Processing Using Distance Oracles for Spatial Networks",

*IEEE Transactions on Knowledge & Data Engineering*, vol. 22, no. 8, pp. 1158-1175, August 2010, doi:10.1109/TKDE.2010.75